C writing readdir to char array variable? - c

I am attempting to write a directory list into a char array but getting segmentation faults when attempting to use strcpy or strcat. Is there a better way to go about this?
I am just wanting to modify the following to create a string instead of printing to stdout. I am guessing I am just missing something really simple, but I have not been able to pin it down.
#include <stdio.h>
#include <dirent.h>
int main(void)
{
char returnData[2048];
struct dirent *de; // Pointer for directory entry
// opendir() returns a pointer of DIR type.
DIR *dr = opendir(".");
if (dr == NULL) // opendir returns NULL if couldn't open directory
{
printf("Could not open current directory" );
return 0;
}
// Refer http://pubs.opengroup.org/onlinepubs/7990989775/xsh/readdir.html
// for readdir()
while ((de = readdir(dr)) != NULL)
printf("%s\n", de->d_name); //strcat(returnData, de->d_name); produces segmentation fault here.
closedir(dr);
return 0;
}

First change:
char returnData[2048];
to
char returnData[2048] = { '\0' };
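As already mentioned in the comments, you should initialize your array with zeros (the NUL terminator) so that the call to strcat is well defined: strcat overwrites the terminating '\0' of the destination with the start of the src parameter, so the destination must already hold a valid string. You also need to #include <string.h> for the declaration.
Since some compilers complain about strcat, use strncat or similar instead.
Also don't forget that you need to append '\n' after each name to get the same output as with your printf.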
You could either calculate the length beforehand resulting in two loops
or resize the buffer dynamically.
BTW: Why do you want to store it in a single string?
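To make the points above concrete, here is a minimal sketch of appending each name plus a newline to a fixed buffer while keeping track of the space used. The helper name append_line and its parameters are made up for illustration, and snprintf is used here in place of strcat/strncat because its return value makes truncation easy to detect:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append name and '\n' to buf (capacity bufsz).
 * *used is the number of characters already stored (not counting '\0').
 * Returns 1 on success, 0 if the name does not fit (buf keeps its
 * previous contents).
 */
int append_line (char *buf, size_t bufsz, size_t *used, const char *name)
{
    size_t avail;
    int n;

    assert (buf && used && name);
    if (*used >= bufsz)                     /* no room even for '\0' */
        return 0;

    avail = bufsz - *used;                  /* space left, incl. '\0' */
    n = snprintf (buf + *used, avail, "%s\n", name);

    if (n < 0 || (size_t)n >= avail) {      /* error or truncated */
        buf[*used] = '\0';                  /* undo the partial write */
        return 0;
    }
    *used += (size_t)n;
    return 1;
}
```

In the readdir loop this would be called as append_line (returnData, sizeof returnData, &used, de->d_name) with size_t used = 0 declared beforehand, breaking out of the loop when it returns 0.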

You are missing a couple of things. First, don't use magic numbers. Where did 2048 come from? (a lick of the finger and holding it up in the air and saying "yep, that should be good enough"?) On POSIX systems the limits.h header usually provides the macro PATH_MAX, the maximum length of a pathname, which is large enough to hold any single directory entry name -- use that instead, e.g.:
#include <limits.h> /* for PATH_MAX */
...
char returnData[PATH_MAX] = ""; /* initialize to all zero is good habit */
(next, a typo I'm sure, but null is not NULL)
If you simply want to copy de->d_name to returnData, then use a function that will copy de->d_name to returnData, like strcpy, e.g.
while ((de = readdir(dr)) != NULL) {
strcpy (returnData, de->d_name);
puts (returnData);
}
(of course, it is overwritten on each iteration, so you are not going to return a list of files using returnData, notwithstanding the fact that returnData is declared within the current function with automatic-storage and if declared within another function could not be returned to begin with...)
So, all this beating around the bush to copy de->d_name to returnData has left you exactly where you began, being able to do no more than output the name of an entry one at a time.
Actually Allocating Storage For Each Directory Entry
What I suspect you are really wanting to do is to read all files in a directory into storage in a way you can return the list of names from a function for further processing within your code. That is common practice, but not something you can do with a single character array.
Instead, you need to declare a pointer-to-pointer-to-char (e.g. a "double-pointer", char **dlist;) which will allow you to allocate for some initial number of pointers (say 8) and then realloc more pointers as required to accommodate all the file or directory names within any directory. You then allocate only the storage required for each name (+1 for the nul-terminating character) and assign the storage for each name to its corresponding pointer and copy the name to the new storage you allocate.
That way you can then return a pointer to your collection of names from wherever you like. (remember, objects with allocated storage duration have a lifetime that continues until the memory is freed or the program ends) Something like:
#define NPTRS 8 /* initial number of pointers to allocate */
...
char **dlist = NULL; /* ptr-to-ptr-to-char for names */
size_t idx = 0, /* current index */
nptrs = NPTRS; /* number of pointers allocated */
...
/* allocate/validate nptrs pointers */
if ((dlist = calloc (nptrs, sizeof *dlist)) == NULL) {
perror ("calloc-nptrs");
return EXIT_FAILURE;
}
while ((de = readdir (dp))) {
...
/* check if dlist pointer limit reached - realloc */
if (idx == nptrs) { /* always realloc to temporary pointer */
void *tmp = realloc (dlist, nptrs * 2 * sizeof *dlist);
if (!tmp) { /* validate reallocation */
perror ("realloc-dlist");
break; /* break, don't exit, original storage still valid */
}
dlist = tmp; /* assign reallocated block to dlist */
/* (optional) set all newly allocated memory to zero */
memset (dlist + nptrs, 0, nptrs * sizeof *dlist);
nptrs *= 2; /* update the number of allocated pointers */
}
/* allocate storage for name in dlist */
if ((dlist[idx] = malloc (strlen (de->d_name) + 1)) == NULL) {
char errbuf[PATH_MAX] = ""; /* storage for perror message */
sprintf (errbuf, "malloc failed '%s'", de->d_name);
perror (errbuf);
break;
}
strcpy (dlist[idx++], de->d_name); /* copy to new storage at idx */
}
Now you have all names stored in dlist, where idx indicates the number of names stored. You can return dlist from any function. You will also want to return idx through a parameter so the number of files stored is available back in the calling function, or move the reallocation below your copy (and include the 'optional' memset) to ensure you always have a sentinel NULL pointer following the last valid entry, which provides another way to indicate the valid names returned.
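As a rough sketch of what returning the list from a function could look like, the version below returns the count through a parameter and also keeps a sentinel NULL after the last name. The names read_names and free_names are made up for illustration, and unlike the full example further down it does not skip "." and "..":

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read the names in path into a NULL-terminated
 * list and store the count in *n. Returns NULL if the directory cannot
 * be opened or the list cannot be allocated. Free with free_names().
 */
char **read_names (const char *path, size_t *n)
{
    size_t idx = 0, nptrs = 8;
    char **dlist;
    struct dirent *de;
    DIR *dp;

    assert (path && n);
    *n = 0;
    if (!(dp = opendir (path)))
        return NULL;
    if (!(dlist = calloc (nptrs, sizeof *dlist))) {
        closedir (dp);
        return NULL;
    }
    while ((de = readdir (dp))) {
        size_t len = strlen (de->d_name) + 1;
        if (idx + 1 == nptrs) {     /* keep one slot free for the sentinel */
            void *tmp = realloc (dlist, 2 * nptrs * sizeof *dlist);
            if (!tmp)
                break;              /* keep what was read so far */
            dlist = tmp;
            nptrs *= 2;
        }
        if (!(dlist[idx] = malloc (len)))
            break;                  /* dlist[idx] is NULL: still terminated */
        memcpy (dlist[idx++], de->d_name, len);
        dlist[idx] = NULL;          /* sentinel after the last valid name */
    }
    closedir (dp);
    *n = idx;
    return dlist;
}

/* Free every name up to the sentinel, then the pointer block itself. */
void free_names (char **dlist)
{
    if (!dlist)
        return;
    for (size_t i = 0; dlist[i]; i++)
        free (dlist[i]);
    free (dlist);
}
```

The caller can then loop either on the count or until it reaches the NULL pointer, whichever is more convenient.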
As you have found (or will find), readdir does not read the directory entries in any particular order. To be useful for output, sorting with qsort is the easiest way to order the filenames you have stored. A simple sort-ascending is shown in the example below.
Putting it all together, you can read the entries in any directory (passed as the 1st argument to the program or from '.' (current dir) by default). The code will allocate pointers and reallocate as required. The code allocates exactly strlen(de->d_name) + 1 characters of storage for each entry, assigns the new block of memory to dlist[idx] and then copies the entry to dlist[idx]. (you can use dlist[idx] = strdup (de->d_name); to allocate and copy in one step if your library provides strdup -- but remember strdup is allocating memory, so you should validate it succeeds before proceeding)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h> /* opendir */
#include <dirent.h> /* opendir, readdir */
#include <limits.h> /* for PATH_MAX */
#define NPTRS 8 /* initial number of pointers to allocate */
/** qsort string comparison (sort ascending) */
int cmpstr (const void *a, const void *b)
{
return strcmp (*(char * const *) a, *(char * const *) b);
}
int main (int argc, char **argv) {
char **dlist = NULL, /* ptr-to-ptr-to-char for names */
*dname = argc > 1 ? argv[1] : "."; /* dirname supplied (. default) */
size_t idx = 0, /* current index */
nptrs = NPTRS; /* number of pointers allocated */
struct dirent *de = NULL; /* dirent pointer (readdir) */
DIR *dp = opendir (dname); /* directory pointer (opendir) */
if (!dp) { /* validate directory open for reading */
char errbuf[PATH_MAX] = ""; /* storage for perror message */
snprintf (errbuf, sizeof errbuf, "opendir failed on '%s'", dname); /* dname is user input, bound the write */
perror (errbuf);
return EXIT_FAILURE;
}
/* allocate/validate nptrs pointers */
if ((dlist = calloc (nptrs, sizeof *dlist)) == NULL) {
perror ("calloc-nptrs");
return EXIT_FAILURE;
}
while ((de = readdir (dp))) {
/* skip dot files */
if (!strcmp (de->d_name, ".") || !strcmp (de->d_name, ".."))
continue;
/* check if dlist pointer limit reached - realloc */
if (idx == nptrs) { /* always realloc to temporary pointer */
void *tmp = realloc (dlist, nptrs * 2 * sizeof *dlist);
if (!tmp) { /* validate reallocation */
perror ("realloc-dlist");
break; /* break, don't exit, original storage still valid */
}
dlist = tmp; /* assign reallocated block to dlist */
/* (optional) set all newly allocated memory to zero */
memset (dlist + nptrs, 0, nptrs * sizeof *dlist);
nptrs *= 2; /* update the number of allocated pointers */
}
/* allocate storage for name in dlist */
if ((dlist[idx] = malloc (strlen (de->d_name) + 1)) == NULL) {
char errbuf[PATH_MAX] = ""; /* storage for perror message */
sprintf (errbuf, "malloc failed '%s'", de->d_name);
perror (errbuf);
break;
}
strcpy (dlist[idx++], de->d_name); /* copy to new storage at idx */
}
closedir (dp); /* close directory */
/* qsort names stored in dlist */
qsort (dlist, idx, sizeof *dlist, cmpstr);
/* output all file/directory names stored, freeing memory as you go */
printf ("'%s' contains '%zu' files:\n\n", dname, idx);
for (size_t i = 0; i < idx; i++) {
puts (dlist[i]); /* output name */
free (dlist[i]); /* free storage for name */
}
free (dlist); /* free pointers */
return 0;
}
Example Use/Output
$ ./bin/opendir_readdir_dyn_char_basic .
'.' contains '1860' files:
3darrayaddr.c
3darrayalloc.c
3darrayfill.c
BoggleData.txt
DoubleLinkedList-old.c
DoubleLinkedList.c
DoubleLinkedList.diff
InputFile.txt
MonoSound.wav
...
xsplit.sh
xstrncpy.c
zeronotzero.c
Memory Use/Error Check
Note also in any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use: just run your program through one.
$ valgrind ./bin/opendir_readdir_dyn_char_basic .
==16528== Memcheck, a memory error detector
==16528== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==16528== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==16528== Command: ./bin/opendir_readdir_dyn_char_basic .
==16528==
'.' contains '1860' files:
3darrayaddr.c
3darrayalloc.c
3darrayfill.c
BoggleData.txt
DoubleLinkedList-old.c
DoubleLinkedList.c
DoubleLinkedList.diff
InputFile.txt
MonoSound.wav
...
xsplit.sh
xstrncpy.c
zeronotzero.c
==16528==
==16528== HEAP SUMMARY:
==16528== in use at exit: 0 bytes in 0 blocks
==16528== total heap usage: 1,872 allocs, 1,872 frees, 109,843 bytes allocated
==16528==
==16528== All heap blocks were freed -- no leaks are possible
==16528==
==16528== For counts of detected and suppressed errors, rerun with: -v
==16528== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have further questions.

Related

How to use Fread and Fwrite when i don't know the amount of data in a binary file

I know the theory of fwrite and fread but I must be making some mistake because I can't make them work. I made a random struct, initialized an array and used fwrite to save it into a binary file. Then I opened the same binary file, used fread and saved what was inside in another array. With the debugger I saw what was inside the second array and it says, for example:
ParkingLot2[0].company= "ffffffffffffffff"
ParkingLot2[0].years=-842150451.
When pasting the code I removed all the stuff that made the code too long like if (f==NULL) for controlling the opening of the file and the pointer==NULL after the mallocs and the control that fread and fwrite read the right amount of data.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct Cars {
int years;
char company[16];
}Tcars;
void WriteBinaryFile(char* file_name, Tcars* p_1)
{
FILE* f;
f = fopen(file_name, "wb");
fwrite(p_1, sizeof(Tcars), 3, f);
fclose(f);
return;
}
Tcars* ReadBinaryFile(char* file_name, Tcars* p_2, int* pc)
{
FILE* f;
Tcars temp;
size_t number = 1;
f = fopen("cars.dat", "rb");
while (number) {
number = fread(&temp, sizeof(Tcars), 1, f);
if (number)
(*pc)++;
}
/*i already know that the size is 3 but i want to try this method because in my last exam i
was given a .dat file from my professor and i didn't know how much data i had to read through */
if ((*pc) != 0)
{
p_2 = malloc(sizeof(Tcars) * (*pc));
fread(p_2, sizeof(Tcars), (*pc), f);
}
fclose(f);
return p_2;
}
int main()
{
Tcars* ParkingLot1 = malloc(sizeof(Tcars) * 3);
for(int i=0;i<3;i++)
{
ParkingLot1[i].years = 2000 + i;
}
strcpy(ParkingLot1[0].company, "Fiat");
strcpy(ParkingLot1[1].company, "Ford");
strcpy(ParkingLot1[2].company,"Toyota");
Tcars* ParkingLot2 = NULL;
int cars_amount = 0;
WriteBinaryFile("cars.dat", ParkingLot1);
ParkingLot2 = ReadBinaryFile("cars.dat", ParkingLot2, &cars_amount);
free(ParkingLot1);
free(ParkingLot2);
return 0;
}
You have a number of small (and some not so small) errors that are causing you problems. Your primary problem is passing Tcars* p_2 to ReadBinaryFile() and allocating with p_2 = malloc(sizeof(Tcars) * (*pc)); each time. Why?
Each call to malloc() returns a new block of memory with a new and different address. You overwrite the address of p_2 with each new call, creating a memory leak and losing the data stored prior to the last call. Instead, you need to realloc() to reallocate a larger block of memory and copy your existing data to the new larger block so you can add the next Tcars worth of data at the end of the reallocated block.
If you are reading data from one file, then there is no need to pass p_2 as a parameter to begin with. Simply declare a new pointer in ReadBinaryFile() and realloc() and add each Tcars worth of data at the end and then return the newly allocated pointer. (you must validate every allocation and reallocation)
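One further detail in the question's code is worth pointing out: the counting loop leaves the file position at end-of-file, so the later fread (p_2, ...) has nothing left to read and the malloc'd block keeps its uninitialized contents, which would be consistent with the garbage seen in the debugger. If you prefer the count-then-read approach over realloc(), the file position has to be reset first. A minimal sketch, assuming the same Tcars layout as in the question (the function name read_two_pass is made up for illustration):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Cars {
    int years;
    char company[16];
} Tcars;

/* Hypothetical helper: count the records, rewind, then read them all.
 * Returns an allocated block (caller frees) and stores the count in *n,
 * or returns NULL with *n == 0 on failure or an empty file.
 */
Tcars *read_two_pass (const char *file_name, size_t *n)
{
    Tcars temp, *cars = NULL;
    size_t count = 0;
    FILE *f;

    assert (file_name && n);
    *n = 0;
    if (!(f = fopen (file_name, "rb")))
        return NULL;

    while (fread (&temp, sizeof temp, 1, f) == 1)   /* pass 1: count */
        count++;

    rewind (f);     /* back to the start, also clears the EOF indicator */

    if (count && (cars = malloc (count * sizeof *cars))) {
        if (fread (cars, sizeof *cars, count, f) != count) {  /* pass 2 */
            free (cars);
            cars = NULL;
        }
        else
            *n = count;
    }
    fclose (f);
    return cars;
}
```

This reads the file twice, so the single-pass realloc() approach is usually the better fit when the file is large or cannot be rewound (a pipe, for example).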
Your choice of void for WriteBinaryFile() will conceal any error encountered creating or writing to your new file. You must validate EVERY input/output file operation, especially if the data written will be used later in your program. A simple choice of return of type int returning 0 for failure and 1 for success (or vice-versa, up to you) is all you need. That way you can handle any error during file creation or writing and not blindly assume success.
A slightly more subtle issue/error is your allocation for ParkingLot1 using malloc(). Ideally you would use calloc() or use memset to set all bytes zero. Why? You write the entire Tcars struct to the file. malloc() does not initialize the memory allocated. That means all characters in the company name between the end of the name (the nul-terminating character) and the end of the 16 bytes of storage will be uninitialized. While that will not cause problems with your read or write, it is far better to ensure all data being written to your file is initialized data. Otherwise examining the contents of the file will show blocks of uninitialized values written to the file.
Another small style issue is the '*' in the declaration of a pointer generally goes with the variable and not the type. Why?
Tcars* a, b, c;
The declaration above most certainly does not declare three pointers to Tcars. Instead it declares one pointer, a, and two structs of type Tcars, b and c (with automatic storage duration if declared inside a function). Writing:
Tcars *a, b, c;
makes that clear.
Lastly, don't use MagicNumbers or hardcode filenames in your functions. You shouldn't need to recompile your code just to read or write a different filename. It's fine to use "cars.dat" as a default filename in main(), but either take the filenames as the 1st argument to your program (that's what int argc, char **argv parameters to main() are for) or prompt the user for a filename and take it as input. 3 and 16 are MagicNumbers. If you need a constant, #define them or use a global enum.
Putting it all together, you could do something similar to the following to read an unknown number of Tcars from your data file:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NCARS 3 /* if you need a constant, #define one (or more) */
#define COMPANY 16
typedef struct Cars {
int years;
char company[COMPANY];
} Tcars;
/* returns 1 on success 0 on error */
int WriteBinaryFile (char *file_name, Tcars *p_1, size_t nelem)
{
FILE *f;
size_t size = sizeof *p_1;
f = fopen (file_name, "wb");
if (!f) { /* validate file open for writing */
perror ("fopen-file_name-write");
return 0;
}
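/* validate that nelem blocks of size are written to file */
if (fwrite (p_1, size, nelem, f) != nelem) {
fputs ("error: short write to file.\n", stderr);
fclose (f); /* close the file before returning on error */
return 0;
}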
if (fclose (f)) { /* validate close-after-write */
perror ("fclose-f");
return 0;
}
return 1;
}
/* returns pointer to allocated block holding ncars cars on success,
* NULL on failure to read any cars from file_name.
*/
Tcars *ReadBinaryFile (char *file_name, size_t *ncars)
{
FILE *f;
Tcars *tcars = NULL, temp;
size_t nelem = 0, size = sizeof (Tcars); /* count starts at zero for the new block */
f = fopen (file_name, "rb");
if (!f) {
perror ("fopen-file_name-read");
return NULL;
}
while (fread (&temp, size, 1, f) == 1) {
/* always realloc to a temporary pointer */
void *tempptr = realloc (tcars, (nelem + 1) * size);
if (!tempptr) { /* validate realloc succeeds or handle error */
perror ("realloc-tcars");
break;
}
tcars = tempptr; /* assign reallocated block */
memcpy (tcars + nelem, &temp, size); /* copy new car to end of block */
nelem += 1;
}
fclose (f);
*ncars = nelem;
return tcars;
}
/* void is fine for print functions with no bearing on the
* continued operation of your code.
*/
void prn_cars (Tcars *cars, size_t nelem)
{
for (size_t i = 0; i < nelem; i++) {
printf ("%4d %s\n", cars[i].years, cars[i].company);
}
}
int main (int argc, char **argv)
{
/* read from filename provided as 1st argument ("cars.dat" by default) */
char *filename = argc > 1 ? argv[1] : "cars.dat";
/* must use calloc() on ParkingLot1 or zero memory to avoid writing
* uninitialized characters (rest of company) to file.
*/
Tcars *ParkingLot1 = calloc (NCARS, sizeof(Tcars)),
*ParkingLot2 = NULL;
size_t cars_amount = 0;
if (!ParkingLot1) { /* validate EVERY allocation */
perror ("calloc-ParkingLot1");
return 1;
}
for (int i = 0; i < NCARS; i++) {
ParkingLot1[i].years = 2000 + i;
}
strcpy (ParkingLot1[0].company, "Fiat");
strcpy (ParkingLot1[1].company, "Ford");
strcpy (ParkingLot1[2].company,"Toyota");
/* validate WriteBinaryFile succeeds or handle error */
if (!WriteBinaryFile (filename, ParkingLot1, NCARS)) {
return 1;
}
ParkingLot2 = ReadBinaryFile (filename, &cars_amount);
if (ParkingLot2) { /* validate ReadBinaryFile succeeds or handle error */
prn_cars (ParkingLot2, cars_amount); /* output cars read from file */
free (ParkingLot2); /* free if ParkingLot2 not NULL */
}
free(ParkingLot1); /* free ParkingLot1 */
}
(note: always check the return of fclose() after a write to catch any file errors, including errors flushing the data to the file that can't be caught at the time of the fwrite() call)
Also note that the man page for fread and fwrite states they can read or write fewer than the number of elements you request. A short read may or may not represent an error or premature end-of-file, and you need to call ferror() and feof() to determine which, if any, occurred. While direct file reads from disk are not as prone to short reads as network reads and writes, a full implementation would protect against a short read regardless of where the data is being read from or written to. Further investigation is left to you.
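As a starting point for that investigation, here is a minimal sketch of a read loop that keeps calling fread() until the requested number of records has arrived or nothing more can be read, and then uses ferror() to tell an error apart from a plain end-of-file. The function name read_cars and the status convention are assumptions made for this example only:

```c
#include <assert.h>
#include <stdio.h>

typedef struct Cars {
    int years;
    char company[16];
} Tcars;

/* Hypothetical helper: read up to max records from f into cars.
 * Returns the number of complete records read and sets *status to
 *   0  if reading stopped without an error (EOF or max reached),
 *  -1  if a read error occurred (ferror).
 */
size_t read_cars (FILE *f, Tcars *cars, size_t max, int *status)
{
    size_t n = 0;

    assert (f && cars && status);
    while (n < max) {
        /* fread may return fewer elements than requested */
        size_t got = fread (cars + n, sizeof *cars, max - n, f);
        n += got;
        if (got == 0)           /* nothing more: EOF or error */
            break;
    }
    *status = ferror (f) ? -1 : 0;
    return n;
}
```

After the call, feof (f) tells you whether the end of the file was reached, and a count smaller than expected together with a status of 0 means the file simply held fewer complete records.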
Example Use/Output
$ ./fwrite_fread_cars dat/cars.dat
2000 Fiat
2001 Ford
2002 Toyota
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use: just run your program through one.
$ valgrind ./fwrite_fread_cars dat/cars.dat
==7237== Memcheck, a memory error detector
==7237== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==7237== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==7237== Command: ./fwrite_fread_cars dat/cars.dat
==7237==
2000 Fiat
2001 Ford
2002 Toyota
==7237==
==7237== HEAP SUMMARY:
==7237== in use at exit: 0 bytes in 0 blocks
==7237== total heap usage: 9 allocs, 9 frees, 10,340 bytes allocated
==7237==
==7237== All heap blocks were freed -- no leaks are possible
==7237==
==7237== For lists of detected and suppressed errors, rerun with: -s
==7237== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any questions.

Issue with the strtok function in C: it only returns one token

Can someone please help me figure out why my code is not assigning the next element the strtok function returns to my index? It attaches the first return of strtok, but not any of the following ones. So my while loop only runs once. I am not sure what is going on, much help appreciated. Thanks!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int ac, char **av, char **env)
{
char **token;
const char *deli = ":";
char *path = "PATH=";
char *hold;
int i, j, k = 0, inputSize = 100;
int count;
token = malloc(inputSize * sizeof(char));
if(token == NULL)
{
exit;
}
for (count = 0; count < inputSize; count++)
{
token[count] = malloc(sizeof(char) * (inputSize));
if (token[count] == NULL)
{
for (count -= 1; count >= 0; count--)
{
free(token[count]);
}
free(token);
return (0);
}
}
//this loop gets PATH
for (i = 0; env[i] != NULL; i++)
{
for (j = 0; j < 5; j++)
{
if (path[j] != env[i][j])
break;
}
if (j == 5)
break;
}
strtok(env[i], deli);
hold = strtok(env[i], deli);
while (hold != NULL)
{
token[k] = (char*)hold;
printf("%s\n", token[k]);
k++;
hold = strtok(NULL, deli);
}
return (0);
}
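You have several issues causing problems. When using strtok(), the first call makes use of the pointer itself, e.g. strtok (p, delim);, while all subsequent calls use NULL in place of the pointer, e.g. strtok (NULL, delim);. Your code calls strtok (env[i], deli) twice. The first call overwrites the first ':' with '\0', so the second call starts over on a string that now ends after its first component. It returns that one token, and the following strtok (NULL, deli) returns NULL, which is why your loop runs only once. Also note that token = malloc(inputSize * sizeof(char)) allocates inputSize bytes rather than inputSize pointers (use sizeof *token), and that token[k] = hold overwrites the pointers you allocated in the loop above, leaking that memory.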
strtok() modifies the string it operates on. You may want to make a copy of the environment string holding the PATH before altering the original.
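A small sketch of that idea: the helper below (count_tokens is a made-up name for this example) tokenizes a heap copy, so the caller's string is left untouched, and it shows the "string on the first call, NULL afterwards" pattern in one place:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the tokens in s without modifying s.
 * Works on a heap copy because strtok() writes '\0' over delimiters.
 * Returns (size_t)-1 if the copy cannot be allocated.
 */
size_t count_tokens (const char *s, const char *delim)
{
    size_t n = 0, len;
    char *copy;

    assert (s && delim);
    len = strlen (s) + 1;               /* +1 for the nul-terminator */
    if (!(copy = malloc (len)))
        return (size_t)-1;
    memcpy (copy, s, len);

    /* first call passes the string, later calls pass NULL */
    for (char *tok = strtok (copy, delim); tok; tok = strtok (NULL, delim))
        n++;

    free (copy);
    return n;
}
```

Note that strtok() treats a run of adjacent delimiters as a single separator, so an empty component such as the one in "a::b" is not counted.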
You are also complicating locating the PATH environment string and then removing the "PATH=" portion so the remainder can be tokenized into the individual path components. A simple way to approach this is to use strncmp() to search for the prefix "PATH=" and then advance past the first five characters in the string.
You can do that simply with PREFIX defined as "PATH=" and prefixlen = strlen(PREFIX); as:
char path[PATH_MAX] = "", /* storage for copy of environment string */
*p = path, /* pointer - general use */
**tokens = NULL; /* pointer to pointer for path components */
...
for (int i = 0; env[i]; i++) /* find PREFIX in environment */
if (strncmp (env[i], PREFIX, prefixlen) == 0) {
strcpy (p, env[i] + prefixlen); /* copy path from env to path */
break;
}
If you initialize the storage for path to the empty string or all zero, you can confirm the PATH environment variable was found by testing that your copy contains a string, e.g.
if (!*p) { /* if PREFIX not found */
fprintf (stderr, "error: '%s' not found in environment.\n", PREFIX);
return 1;
}
When you use strtok() to separate the path components in path for access through the pointer-to-pointer-to-char, you must allocate a pointer for each component (and an additional pointer if you want to provide a sentinel NULL as the final pointer marking the end). While you generally would want to allocate a block of pointers, with the relatively few path components (less than 100), you can simply realloc and add a pointer for each token found. When you realloc, you always use a temporary pointer to prevent overwriting your original pointer with NULL if realloc fails.
You also need to allocate storage for each of the path components ensuring storage for the string (+1) for the nul-terminating character. You can use a simple for() loop with strtok() to tokenize your copy of the environment string similar to:
/* loop over each DELIM-separated token in p */
for (p = strtok (p, DELIM); p; p = strtok (NULL, DELIM)) {
size_t len = strlen(p); /* length of token */
/* realloc pointers using temporary pointer
* adding two more than index to set sentinel NULL as last pointer
* (sentinel NULL is optional, just add 1 if not wanted)
*/
void *tmp = realloc (tokens, (ndx + 2) * sizeof *tokens);
if (!tmp) { /* validate realloc */
perror ("realloc-tokens");
break;
}
tokens = tmp; /* assign new block to tokens */
tokens[ndx + 1] = NULL; /* set sentinel NULL */
if (!(tokens[ndx] = malloc (len + 1))) { /* allocate/validate string storage */
perror ("malloc-tokens[ndx]");
tokens[ndx] = NULL;
break;
}
memcpy (tokens[ndx++], p, len + 1); /* copy path component */
}
Putting it all together in a short example you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif
#define PREFIX "PATH=" /* constants for PREFIX and DELIM */
#define DELIM ":\n"
int main (int argc, char **argv, char **env)
{
char path[PATH_MAX] = "", /* storage for copy of environment string */
*p = path, /* pointer - general use */
**tokens = NULL; /* pointer to pointer for path components */
size_t prefixlen = strlen (PREFIX), /* prefix length */
ndx = 0; /* path tokens index */
for (int i = 0; env[i]; i++) /* find PREFIX in environment */
if (strncmp (env[i], PREFIX, prefixlen) == 0) {
strcpy (p, env[i] + prefixlen); /* copy path from env to path */
break;
}
if (!*p) { /* if PREFIX not found */
fprintf (stderr, "error: '%s' not found in environment.\n", PREFIX);
return 1;
}
puts ("\nfull path:\n"); /* show full path to create tokens from */
puts (p);
/* loop over each DELIM-separated token in p */
for (p = strtok (p, DELIM); p; p = strtok (NULL, DELIM)) {
size_t len = strlen(p); /* length of token */
/* realloc pointers using temporary pointer
* adding two more than index to set sentinel NULL as last pointer
* (sentinel NULL is optional, just add 1 if not wanted)
*/
void *tmp = realloc (tokens, (ndx + 2) * sizeof *tokens);
if (!tmp) { /* validate realloc */
perror ("realloc-tokens");
break;
}
tokens = tmp; /* assign new block to tokens */
tokens[ndx + 1] = NULL; /* set sentinel NULL */
if (!(tokens[ndx] = malloc (len + 1))) { /* allocate/validate string storage */
perror ("malloc-tokens[ndx]");
tokens[ndx] = NULL;
break;
}
memcpy (tokens[ndx++], p, len + 1); /* copy path component */
}
puts ("\npath components:\n"); /* output separated tokens */
for (size_t i = 0; tokens && tokens[i]; i++) { /* loop until sentinel NULL */
puts (tokens[i]);
free (tokens[i]); /* free string storage */
}
free (tokens); /* free pointers */
(void)argc; /* cast to prevent -Wunused warnings */
(void)argv;
}
(note: the for() loop condition tokens && tokens[i] will handle the case where the initial allocation with realloc() returns NULL avoiding the dereference of tokens[0]. You can provide a separate if (!tokens) { perror ("initial allocation failed"); return 1; } above the output loop if you prefer)
Example Use/Output
Compiling and running the program would result in output similar to:
$ ./bin/path_tokens
full path:
/opt/kde3/bin:/home/david/bin:/usr/local/bin:/usr/bin:/bin:/usr/lib/qt3/bin:/sbin:/usr/sbin:/usr/local/sbin:/opt/gcc-arm-none-eabi/bin
path components:
/opt/kde3/bin
/home/david/bin
/usr/local/bin
/usr/bin
/bin
/usr/lib/qt3/bin
/sbin
/usr/sbin
/usr/local/sbin
/opt/gcc-arm-none-eabi/bin
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use: just run your program through one.
$ valgrind ./bin/path_tokens
==22643== Memcheck, a memory error detector
==22643== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==22643== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==22643== Command: ./bin/path_tokens
==22643==
full path:
/opt/kde3/bin:/home/david/bin:/usr/local/bin:/usr/bin:/bin:/usr/lib/qt3/bin:/sbin:/usr/sbin:/usr/local/sbin:/opt/gcc-arm-none-eabi/bin
path components:
/opt/kde3/bin
/home/david/bin
/usr/local/bin
/usr/bin
/bin
/usr/lib/qt3/bin
/sbin
/usr/sbin
/usr/local/sbin
/opt/gcc-arm-none-eabi/bin
==22643==
==22643== HEAP SUMMARY:
==22643== in use at exit: 0 bytes in 0 blocks
==22643== total heap usage: 21 allocs, 21 frees, 1,679 bytes allocated
==22643==
==22643== All heap blocks were freed -- no leaks are possible
==22643==
==22643== For counts of detected and suppressed errors, rerun with: -v
==22643== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Changes To Work On Windows and Linux
Since Windows uses the ';' (semicolon) as the path separator and prefixes the environment string with "Path=", while Linux uses ':' (colon) as the separator and "PATH=" as the prefix, a simple preprocessor #if .. #else .. #endif lets the program work on both Windows and Linux. Simply replace the defines for PREFIX and DELIM with:
#if defined (_WIN64) || defined (_WIN32)
#define PREFIX "Path=" /* constants for PREFIX and DELIM on windows */
#define DELIM ";\r\n"
#else
#define PREFIX "PATH=" /* constants for PREFIX and DELIM on Linux */
#define DELIM ":\n"
#endif
Example Use/Output on Windows
The full path has been omitted since it would scroll for a thousand plus characters:
>bin\path_tokens_win.exe
full path:
<snipped -- too long>
path components:
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x86
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\VC\VCPackages
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\bin\Roslyn
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Team Tools\Performance Tools
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Common\VSPerfCollectionTools\
C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.6.1 Tools\
C:\Program Files (x86)\Windows Kits\10\bin\10.0.17763.0\x86
C:\Program Files (x86)\Windows Kits\10\bin\x86
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\\MSBuild\15.0\bin
C:\Windows\Microsoft.NET\Framework\v4.0.30319
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\
C:\Program Files\ImageMagick-7.0.10-Q16
C:\Program Files (x86)\Common Files\Oracle\Java\javapath
C:\ProgramData\Oracle\Java\javapath
C:\Program Files\ImageMagick-7.0.3-Q16
C:\Windows\System32
C:\Windows
C:\Windows\System32\wbem
C:\Windows\System32\WindowsPowerShell\v1.0\
C:\Program Files (x86)\PDFtk\bin\
C:\Program Files (x86)\PDFtk Server\bin\
C:\Program Files (x86)\GNU\GnuPG\pub
C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\
C:\Windows\System32\OpenSSH\
C:\Program Files\Git\cmd
C:\Program Files\PuTTY\
C:\Program Files (x86)\Gpg4win\..\GnuPG\bin
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0\
C:\WINDOWS\System32\OpenSSH\
C:\Users\david\AppData\Local\Microsoft\WindowsApps
c:\MinGW\bin
c:\MinGW\msys\1.0\bin
c:\gtk2\bin
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\Ninja
(Microsoft certainly isn't shy about path size....)
There are several other options aside from strtok() that do not alter the original string. You can use strpbrk() similar to how you use strtok(), manually advancing a start and end pointer to bracket and copy each component. You can use strchr() to locate each delimiter in essentially the same way. If you like working with indexes more than pointers, you can use strcspn() to get the number of characters between each delimiter. Of course you can also just use a loop working down the string, manually locating each delimiter. How you do it is entirely up to you.
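As a rough illustration of the strcspn() option, the sketch below copies one component at a time into a caller-supplied buffer and leaves the original string untouched. The name next_component() and its parameters are hypothetical, chosen only for this example. Like strtok(), it skips empty components.

```c
#include <string.h>

/* Copy the next component of s, starting at offset *pos, into out
 * (at most outsz - 1 characters) without modifying s.
 * Returns 1 if a component was copied, 0 when the string is exhausted.
 * next_component() is a hypothetical helper, not a library function. */
int next_component (const char *s, size_t *pos, const char *delim,
                    char *out, size_t outsz)
{
    const char *p = s + *pos;
    size_t len;

    if (outsz == 0)
        return 0;

    p += strspn (p, delim);             /* skip any leading delimiters */
    if (*p == '\0')                     /* nothing left to return */
        return 0;

    len = strcspn (p, delim);           /* chars up to the next delimiter */
    *pos = (size_t)(p - s) + len;       /* resume here on the next call */

    if (len >= outsz)                   /* truncate to fit the buffer */
        len = outsz - 1;
    memcpy (out, p, len);
    out[len] = '\0';

    return 1;
}
```

Starting with a size_t pos = 0, you would call it in a loop such as while (next_component (path, &pos, DELIM, buf, sizeof buf)) and print buf on each pass.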
Look things over and let me know if you have further questions.

How to take text file as command line argument in C

Basically, I'm trying to accept a text file on the command line so that when I run the program as "program instructions.txt", it stores the values listed in instructions. However, I am having trouble testing what I currently have, as it is giving me the error "segmentation fault core dumped".
int main(int argc, char* argv[]) {
setup_screen();
setup();
// File input
char textExtracted[250];
FILE* file_handle;
file_handle = fopen(argv[1], "r");
while(!feof(file_handle)){
fgets(textExtracted, 150, file_handle);
}
fclose(file_handle);
printf("%s", textExtracted[0]);
return 0;
}
Inside the text file is
A 1 2 3 4 5
B 0 0
C 1 1
I'm just trying to store each line in an array and then print them.
Some points:
int main(int argc, char* argv[]) {
I suggest you check number of arguments here before proceeding further
setup_screen();
setup();
// File input
char textExtracted[250];
The declaration and assignment can be combined, but always check the return values from I/O calls
FILE* file_handle = fopen(argv[1], "r");
if (NULL == file_handle)
{
perror(argv[1]);
return EXIT_FAILURE;
}
The while(!feof(file_handle)) loop below is not the correct way to read from a file. Instead, you should
attempt the read first, then check the result for error/EOF/enough bytes read, e.g.
// while (fgets(textExtracted, sizeof(textExtracted), file_handle) != NULL) {}
while(!feof(file_handle)){
fgets(textExtracted, 150, file_handle);
}
It looks like you think fgets appends to textExtracted when you call it
multiple times. It doesn't! Every line in the file will overwrite the previously read line. Note also that the \n character is included in your buffer.
But since your file appears to be pretty small, you could read the whole
content into your buffer and work with that.
// size_t size = fread(textExtracted, 1, sizeof(textExtracted) - 1, file_handle);
// textExtracted[size] = '\0'; /* nul-terminate before using it as a string */
Better is to check the size of the file first and then allocate a buffer with malloc to hold the whole file, or read the file character by character and do whatever commands you need to do on the fly, e.g. a switch statement is excellent as a state machine:
switch( myreadchar )
{
case 'A':
break;
case 'B':
break;
...
}
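The character-by-character idea above can be sketched as a very small state machine with two states, "at the start of a line" and "inside a line". What each command letter should actually do is not stated in the question, so this example only counts the lines starting with 'A', 'B' and 'C'. The function name count_commands() is an assumption made for illustration.

```c
#include <stdio.h>

/* Count how many lines of fp start with each command letter.
 * counts[0] is for 'A', counts[1] for 'B', counts[2] for 'C'.
 * Returns the number of lines whose first character is unrecognized.
 * count_commands() is a hypothetical name used only in this sketch. */
int count_commands (FILE *fp, int counts[3])
{
    int c, at_line_start = 1, unknown = 0;

    counts[0] = counts[1] = counts[2] = 0;

    while ((c = fgetc (fp)) != EOF) {       /* read, then test for EOF */
        if (at_line_start) {
            switch (c) {
                case 'A':  counts[0]++; break;
                case 'B':  counts[1]++; break;
                case 'C':  counts[2]++; break;
                case '\n': break;           /* empty line, ignore */
                default:   unknown++; break;
            }
        }
        at_line_start = (c == '\n');        /* next char starts a line? */
    }
    return unknown;
}
```

Note that the loop is controlled by the return of fgetc() itself, so there is no need for feof().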
textExtracted[0] is one character, textExtracted is the whole array so instead of
printf("%s", textExtracted[0]);
write
printf("%s", textExtracted);
or even better
fputs(textExtracted, stdout);
return 0;
The problem you present is the classic problem of "How do I read an unknown number of lines of unknown length from a file?" The way you approach the problem in a memory efficient manner is to declare a pointer-to-pointer to char and allocate a reasonable number of pointers to begin with and then allocate for each line assigning the starting address for the block holding each line to your pointers as you read each line and allocate for it.
An efficient way to do that is to read each line from a file into a fixed buffer (of size sufficient to hold your longest line without skimping) with fgets or by using POSIX getline which will allocate as needed to hold the line. You then remove the trailing '\n' from your temporary buffer and obtain the length of the line.
Then allocate a block of memory of length + 1 characters (the +1 for the nul-terminating character) and assign the address for your new block of memory to your next available pointer (keeping track of the number of pointers allocated and the number of pointers used)
When the number of pointers used equals the number allocated, you simply realloc additional pointers (generally by doubling the current number available, or by allocating for some fixed number of additional pointers) and you keep going. Repeat the process as many times as needed until you have read all of the lines in your input file.
There are a number of ways to implement it and arrange the differing tasks, but all basically boil down to the same thing. Start with a temporary buffer of reasonable size to handle your longest line (without skimping, just in case there is some variation in your input data -- a 1K buffer is cheap insurance, adjust as needed). Add your counters to keep track of the number of pointers allocated and the number used (your index). Open and validate that the file given on the command line is open for reading (or read from stdin by default if no argument was given on the command line). For example you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
char buf[MAXC] = "", /* temp buffer to hold line read from file */
**lines = NULL; /* pointer-to-pointer to each line */
size_t ndx = 0, alloced = 0; /* current index, no. of ptrs allocated */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
...
With your file open and validated, you are ready to read each line, controlling your read loop with your read function itself, and following the outline above to handle storage for each line, e.g.
while (fgets (buf, MAXC, fp)) { /* read each line */
size_t len; /* for line length */
if (ndx == alloced) { /* check if realloc needed */
void *tmp = realloc (lines, /* alloc 2X current, or 2 1st time */
(alloced ? 2 * alloced : 2) * sizeof *lines);
if (!tmp) { /* validate EVERY allocation */
perror ("realloc-lines");
break; /* if allocation failed, data in lines still good */
}
lines = tmp; /* assign new block of mem to lines */
alloced = alloced ? 2 * alloced : 2; /* update ptrs alloced */
}
Note: above, the first thing that happens in your read loop is to check if you have pointers available, e.g. if (ndx == alloced), if your index (number used) is equal to the number allocated, you reallocate more. The ternary above alloced ? 2 * alloced : 2 simply asks if you have some previously allocated alloced ? then double the number 2 * alloced otherwise (:) just start with 2 pointers and go from there. In that doubling scheme, you allocate 2, 4, 8, 16, ... pointers with each successive reallocation.
Also note: when you call realloc you always use a temporary pointer, e.g. tmp = realloc (lines, ...) and you never realloc using the pointer itself, e.g. lines = realloc (lines, ...). When (not if) realloc fails, it returns NULL, and if you assign that to your original pointer -- you have just created a memory leak because the address for lines has been lost meaning you cannot reach or free() the memory you previously allocated.
Now you have confirmed you have a pointer available to assign the address of a block of memory to hold the line, remove the '\n' from buf and get the length of the line. You can do that conveniently in a single call to strcspn which returns the initial number of characters in the string not containing the delimiter "\n", e.g.
buf[(len = strcspn(buf, "\n"))] = 0; /* trim \n, get length */
(note: above you are simply overwriting the '\n' with the nul-terminating character 0, equivalent to '\0')
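If it helps to see that idiom in isolation, here is a minimal sketch wrapped in a function. The name trim_newline() is hypothetical. When the string contains no '\n', strcspn returns the full length, so the assignment simply overwrites the existing nul-terminator and nothing changes.

```c
#include <string.h>

/* Overwrite the first '\n' in s (if any) with the nul-terminating
 * character and return the resulting string length.
 * trim_newline() is a hypothetical helper name. */
size_t trim_newline (char *s)
{
    size_t len = strcspn (s, "\n");     /* chars before the first '\n' */

    s[len] = '\0';                      /* harmless if no '\n' present */
    return len;
}
```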
Now that you have the length of the line, you simply allocate length + 1 characters and copy from the temporary buffer buf to your new block of memory, e.g.
if (!(lines[ndx] = malloc (len + 1))) { /* allocate for lines[ndx] */
perror ("malloc-lines[ndx]"); /* validate combined above */
break;
}
memcpy (lines[ndx++], buf, len + 1); /* copy buf to lines[ndx] */
} /* increment ndx */
At that point you are done reading and storing all lines and can simply close the file if not reading from stdin. Here, for example, we just output each of the lines, and then free the storage for each line, finally freeing the memory for the allocated pointers as well, e.g.
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < ndx; i++) { /* loop over each storage line */
printf ("lines[%2zu] : %s\n", i, lines[i]); /* output line */
free (lines[i]); /* free storage for strings */
}
free (lines); /* free pointers */
}
That's it. Putting it all together, you could do:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
char buf[MAXC] = "", /* temp buffer to hold line read from file */
**lines = NULL; /* pointer-to-pointer to each line */
size_t ndx = 0, alloced = 0; /* current index, no. of ptrs allocated */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fgets (buf, MAXC, fp)) { /* read each line */
size_t len; /* for line length */
if (ndx == alloced) { /* check if realloc needed */
void *tmp = realloc (lines, /* alloc 2X current, or 2 1st time */
(alloced ? 2 * alloced : 2) * sizeof *lines);
if (!tmp) { /* validate EVERY allocation */
perror ("realloc-lines");
break; /* if allocation failed, data in lines still good */
}
lines = tmp; /* assign new block of mem to lines */
alloced = alloced ? 2 * alloced : 2; /* update ptrs alloced */
}
buf[(len = strcspn(buf, "\n"))] = 0; /* trim \n, get length */
if (!(lines[ndx] = malloc (len + 1))) { /* allocate for lines[ndx] */
perror ("malloc-lines[ndx]"); /* validate combined above */
break;
}
memcpy (lines[ndx++], buf, len + 1); /* copy buf to lines[ndx] */
} /* increment ndx */
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < ndx; i++) { /* loop over each storage line */
printf ("lines[%2zu] : %s\n", i, lines[i]); /* output line */
free (lines[i]); /* free storage for strings */
}
free (lines); /* free pointers */
}
Example Use/Output
$ ./bin/fgets_lines_dyn dat/cmdlinefile.txt
lines[ 0] : A 1 2 3 4 5
lines[ 1] : B 0 0
lines[ 2] : C 1 1
Redirecting from stdin instead of opening the file:
$ ./bin/fgets_lines_dyn < dat/cmdlinefile.txt
lines[ 0] : A 1 2 3 4 5
lines[ 1] : B 0 0
lines[ 2] : C 1 1
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/fgets_lines_dyn dat/cmdlinefile.txt
==6852== Memcheck, a memory error detector
==6852== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6852== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==6852== Command: ./bin/fgets_lines_dyn dat/cmdlinefile.txt
==6852==
lines[ 0] : A 1 2 3 4 5
lines[ 1] : B 0 0
lines[ 2] : C 1 1
==6852==
==6852== HEAP SUMMARY:
==6852== in use at exit: 0 bytes in 0 blocks
==6852== total heap usage: 6 allocs, 6 frees, 624 bytes allocated
==6852==
==6852== All heap blocks were freed -- no leaks are possible
==6852==
==6852== For counts of detected and suppressed errors, rerun with: -v
==6852== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
While allocating storage for each pointer and each line may look daunting at first, it is a problem you will face over and over again, whether reading and storing lines from a file, integer or floating point values in some 2D representation of the data, etc... it is worth the time it takes to learn. Your alternative is to declare a fixed size 2D array and hope your line length never exceeds the declared width and the number of lines never exceeds your declared number of rows. (you should learn that as well, but the limitations quickly become apparent)
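For comparison, the fixed size 2D array alternative might look like the sketch below. The limits MAXL and MAXC and the function name read_lines_fixed() are assumptions picked for illustration. Any input with more rows or longer lines than those limits is silently cut off or split, which is exactly the limitation mentioned above.

```c
#include <stdio.h>
#include <string.h>

#define MAXL   64   /* assumed maximum number of lines */
#define MAXC 1024   /* assumed maximum line length (including '\0') */

/* Read up to MAXL lines from fp into a fixed-size 2D array, trimming
 * the trailing '\n' from each. Returns the number of rows filled.
 * A line longer than MAXC - 1 characters is split across rows.
 * read_lines_fixed() is a hypothetical name used for this sketch. */
size_t read_lines_fixed (FILE *fp, char lines[MAXL][MAXC])
{
    size_t n = 0;

    while (n < MAXL && fgets (lines[n], MAXC, fp)) {
        lines[n][strcspn (lines[n], "\n")] = '\0';   /* trim '\n' */
        n++;
    }
    return n;
}
```

There is nothing to free here, but the array occupies MAXL * MAXC bytes no matter how small the file is.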
Look things over and let me know if you have further questions.

Is there a way to stop fread reading characters into an array in a struct following a whitespace?

I'm attempting to do homework for my second-semester programming class in which we have to read data from a file like this:
Fred 23 2.99
Lisa 31 6.99
Sue 27 4.45
Bobby 456 18.844
Ann 7 3.45
using structs in fread. I'll eventually have to create a loop to read all of the data then convert it to binary and write it to a file but this is as far as I've gotten before running into a problem:
struct data
{
char name[25];
int iNum;
float fNum;
};
int main(int argc, char *argv[])
{
struct data mov;
FILE *fp;
fp = fopen(argv[1], "r");
fread(&mov, sizeof(struct data), 1, fp);
printf(" name: %s\n int: %d\n float: %f\n", mov.name, mov.iNum, mov.fNum);
return 0;
}
The problem I'm having is that fread will read the first 25 characters into the array instead of stopping at the first whitespace, so it produces output like this:
name: Fred 23 2.99
Lisa 31 6.99
Sue 27 4.4
int: 926031973
float: 0.000000
instead of the desired result, which would be something more like:
name: Fred
int: 23
float: 2.99000
From what I've read, I believe this is how fread is supposed to function, and I'm sure there's a better way of going about this problem, but the assignment requires we use fread and a 25 character array in our struct. What's the best way to go about this?
Is there a way to stop fread reading characters into an array in a
struct following a whitespace?
Answer: Yes (but not with fread directly, you'll need a buffer to accomplish the task)
The requirement you use fread to parse formatted-text from an input file is certainly an academic exercise (a good one at that), but not something you would normally do. Why? Normally when you are concerned with reading lines of data from a file, you use a line-oriented input function such as fgets() or POSIX getline().
You could also use the character-oriented input function fgetc() and simply read from the file, buffering the input, until the '\n' is found, do what you need with the buffer and repeat. Your last normal option (but discouraged due to its fragility) is to use a formatted-input function like fscanf() -- but its misuse accounts for a significant percentage of questions on this site.
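For comparison only (it does not satisfy the fread requirement of the assignment), the usual line-oriented approach of fgets() followed by sscanf() could be sketched like this. The function name read_record() and the 256-byte line buffer are assumptions made for the example.

```c
#include <stdio.h>

#define MAXC 25     /* name size required by the assignment */

struct data {
    char name[MAXC];
    int iNum;
    float fNum;
};

/* Read one line from fp and parse it into *d.
 * Returns 1 on success, 0 on a malformed line, EOF at end of input.
 * read_record() is a hypothetical helper name. */
int read_record (FILE *fp, struct data *d)
{
    char line[256];     /* assumed to be longer than any input line */

    if (!fgets (line, sizeof line, fp))     /* read a whole line first */
        return EOF;

    /* %24s limits the name to 24 chars + '\0', protecting name[25] */
    return sscanf (line, "%24s %d %f", d->name, &d->iNum, &d->fNum) == 3;
}
```

Because the whole line is consumed by fgets() before parsing, a malformed line cannot leave stray characters in the stream to trip up the next read.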
But, if for an academic challenge, you must use fread(), then as mentioned in the comments you will want to read the entire file into an allocated buffer, and then parse that buffer as if you were reading it a line-at-a-time from the actual file. sscanf would be used if reading with fgets() and it can be used here to read from the buffer filled with fread(). The only trick is keeping track of where you are within the buffer to start each read -- and knowing where to stop.
With that outline, how do you approach reading an entire file into a buffer with fread()? You first need to obtain the file length to know how much space to allocate. You do that either by calling stat or fstat and utilizing the st_size member of the filled struct stat containing the filesize, or you use fseek to move to the end of the file and use ftell() to report the offset in bytes from the beginning.
A simple function that takes an open FILE* pointer, saves the current position, moves the file-position indicator to the end, obtains the file-size with ftell() and then restores the file-position indicator to its original position could be:
/* get the file size of file pointed to by fp */
long getfilesize (FILE *fp)
{
fpos_t currentpos;
long bytes;
if (fgetpos (fp, &currentpos) != 0) { /* save current file position */
perror ("fgetpos");
return -1;
}
if (fseek (fp, 0, SEEK_END) != 0) { /* fseek end of file */
perror ("fseek-SEEK_END");
return -1;
}
if ((bytes = ftell (fp)) == -1) { /* get number of bytes */
perror ("ftell-bytes");
return -1;
}
if (fsetpos (fp, &currentpos) != 0) { /* restore file position */
perror ("fsetpos");
return -1;
}
return bytes; /* return number of bytes in file */
}
(note: above each step is validated and -1 is returned on error, otherwise the file-size is returned on success. Make sure you validate each step in your program and always provide a meaningful return from your functions that can indicate success/failure.)
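The stat/fstat alternative mentioned earlier could look roughly like the sketch below. It assumes a POSIX system, since fileno() and fstat() are not part of ISO C, and the name getfilesize_stat() is hypothetical. fstat() reports the size of the file on disk, so it suits a file opened for reading. If the stream has pending buffered writes, they would need to be flushed first.

```c
#define _POSIX_C_SOURCE 200809L     /* expose fileno() and fstat() */
#include <stdio.h>
#include <sys/stat.h>               /* POSIX: fstat, struct stat */

/* Return the size in bytes of the file open on fp, or -1 on error.
 * Relies on POSIX fileno() and fstat(); hypothetical helper name. */
long getfilesize_stat (FILE *fp)
{
    struct stat st;

    if (fstat (fileno (fp), &st) == -1) {   /* fill st from descriptor */
        perror ("fstat");
        return -1;
    }
    return (long) st.st_size;               /* file size in bytes */
}
```

Unlike the fseek/ftell version, this does not move the file-position indicator, so there is nothing to save and restore.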
With the file-size in hand, all you need to do before calling fread() is to allocate a block of memory large enough to hold the contents of the file and assign the starting address of that block of memory to a pointer that can be used with fread(). For example:
long bytes; /* size of file in bytes */
char *filebuf, *p; /* buffer for file and pointer to it */
...
if ((bytes = getfilesize (fp)) == -1) /* get file size in bytes */
return 1;
if (!(filebuf = malloc (bytes + 1))) { /* allocate/validate storage */
perror ("malloc-filebuf");
return 1;
}
(we will talk about the + 1 later on)
Now you have adequate storage for your file and the address for the storage is assigned to the pointer filebuf, you can call fread() and read the entire file into that block of memory with:
/* read entire file into allocated memory */
if (fread (filebuf, sizeof *filebuf, bytes, fp) != (size_t)bytes) {
perror ("fread-filebuf");
return 1;
}
Now your entire file is stored in the block of memory pointed to by filebuf. How do you parse the data line-by-line into your struct (or actually an array of struct so each record is stored within a separate struct)? It's actually pretty easy. You just read from the buffer and keep track of the number of characters used to read up until a '\n' is found, parsing the information in that line into a struct element of the array, add the offset to your pointer to prepare for the next read and increment the index on your array of struct to account for the struct you just filled. You are essentially using sscanf just as you would if you read the line from the file with fgets(), but you are manually keeping track of the offset within the buffer for the next call to sscanf, e.g.
#define NDATA 16 /* if you need a constant, #define one (or more) */
#define MAXC 25
struct data { /* your struct with fixed array of 25-bytes for name */
char name[MAXC];
int iNum;
float fNum;
};
...
struct data arr[NDATA] = {{ .name = "" }}; /* array of struct data */
int used; /* no. chars used by sscanf */
size_t ndx = 0, offset = 0; /* array index, and pointer offset */
...
filebuf[bytes] = 0; /* trick - nul-terminate buffer for sscanf use */
p = filebuf; /* set pointer to filebuf */
while (ndx < NDATA && /* while space in array */
sscanf (p + offset, "%24s %d %f%n", /* parse values into struct */
arr[ndx].name, &arr[ndx].iNum, &arr[ndx].fNum, &used) == 3) {
offset += used; /* update offset with used chars */
ndx++; /* increment array index */
}
That's pretty much it. You can free (filebuf); now that you are done with it and all the values are now stored in your array of struct arr.
There is one important line of code above we have not talked about -- and I told you we would get to it later. It is also something you wouldn't normally do, but it is mandatory when you are going to process the buffer as text with sscanf, a function normally used to process strings. How will you ensure sscanf knows where to stop reading and doesn't continue reading beyond the bounds of filebuf?
filebuf[bytes] = 0; /* trick - nul-terminate buffer for sscanf use */
That's where the + 1 on the allocated size comes into play. You don't normally terminate a buffer -- there is no need. However, if you want to process the contents of the buffer with functions normally used to process strings -- then you do. Otherwise, sscanf will continue to read past the final '\n' in the buffer off into memory you cannot validly access until it finds a random 0 somewhere in the heap (with the potential of filling additional structs with garbage if they happen to satisfy the format-string).
Putting it all together, you could do:
#include <stdio.h>
#include <stdlib.h>
#define NDATA 16 /* if you need a constant, #define one (or more) */
#define MAXC 25
struct data { /* your struct with fixed array of 25-bytes for name */
char name[MAXC];
int iNum;
float fNum;
};
long getfilesize (FILE *fp); /* function prototype for function below */
int main (int argc, char **argv) {
struct data arr[NDATA] = {{ .name = "" }}; /* array of struct data */
int used; /* no. chars used by sscanf */
long bytes; /* size of file in bytes */
char *filebuf, *p; /* buffer for file and pointer to it */
size_t ndx = 0, offset = 0; /* array index, and pointer offset */
FILE *fp; /* file pointer */
if (argc < 2) { /* validate at least 1-arg given for filename */
fprintf (stderr, "error: insufficient arguments\n"
"usage: %s <filename>\n", argv[0]);
return 1;
}
/* open file / validate file open for reading */
if (!(fp = fopen (argv[1], "rb"))) {
perror ("file open failed");
return 1;
}
if ((bytes = getfilesize (fp)) == -1) /* get file size in bytes */
return 1;
if (!(filebuf = malloc (bytes + 1))) { /* allocate/validate storage */
perror ("malloc-filebuf");
return 1;
}
/* read entire file into allocated memory */
if (fread (filebuf, sizeof *filebuf, bytes, fp) != (size_t)bytes) {
perror ("fread-filebuf");
return 1;
}
fclose (fp); /* close file, read done */
filebuf[bytes] = 0; /* trick - nul-terminate buffer for sscanf use */
p = filebuf; /* set pointer to filebuf */
while (ndx < NDATA && /* while space in array */
sscanf (p + offset, "%24s %d %f%n", /* parse values into struct */
arr[ndx].name, &arr[ndx].iNum, &arr[ndx].fNum, &used) == 3) {
offset += used; /* update offset with used chars */
ndx++; /* increment array index */
}
free (filebuf); /* free allocated memory, values stored in array */
for (size_t i = 0; i < ndx; i++) /* output stored values */
printf ("%-24s %4d %7.3f\n", arr[i].name, arr[i].iNum, arr[i].fNum);
return 0;
}
/* get the file size of file pointed to by fp */
long getfilesize (FILE *fp)
{
fpos_t currentpos;
long bytes;
if (fgetpos (fp, &currentpos) != 0) { /* save current file position */
perror ("fgetpos");
return -1;
}
if (fseek (fp, 0, SEEK_END) != 0) { /* fseek end of file */
perror ("fseek-SEEK_END");
return -1;
}
if ((bytes = ftell (fp)) == -1) { /* get number of bytes */
perror ("ftell-bytes");
return -1;
}
if (fsetpos (fp, &currentpos) != 0) { /* restore file position */
perror ("fsetpos");
return -1;
}
return bytes; /* return number of bytes in file */
}
(note: approximately 1/2 the lines of code are devoted to validating each step. That is normal and critical to ensure you don't invoke Undefined Behavior by blindly continuing forward in your code after a failure occurs that prevents you from processing valid data.)
Example Use/Output
With that your program is complete and you should be able to parse the data from the buffer filled by fread(), with each field ending at the following whitespace.
$ ./bin/freadinumfnum dat/inumfnum.txt
Fred 23 2.990
Lisa 31 6.990
Sue 27 4.450
Bobby 456 18.844
Ann 7 3.450
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ valgrind ./bin/freadinumfnum dat/inumfnum.txt
==5642== Memcheck, a memory error detector
==5642== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==5642== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==5642== Command: ./bin/freadinumfnum dat/inumfnum.txt
==5642==
Fred 23 2.990
Lisa 31 6.990
Sue 27 4.450
Bobby 456 18.844
Ann 7 3.450
==5642==
==5642== HEAP SUMMARY:
==5642== in use at exit: 0 bytes in 0 blocks
==5642== total heap usage: 2 allocs, 2 frees, 623 bytes allocated
==5642==
==5642== All heap blocks were freed -- no leaks are possible
==5642==
==5642== For counts of detected and suppressed errors, rerun with: -v
==5642== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
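The question also mentions later writing the records to a file in binary. Once the values are in the array of struct, a minimal sketch of that step could use fwrite() and fread() directly on the array, as below. The function names are hypothetical. Note that the raw struct layout (padding, byte order, size of int) is written as-is, so such a file is only reliably readable by a program built the same way on the same kind of machine.

```c
#include <stdio.h>

#define MAXC 25

struct data {
    char name[MAXC];
    int iNum;
    float fNum;
};

/* Write n records to fp in binary form; returns records written.
 * write_records() and read_records() are hypothetical helper names. */
size_t write_records (FILE *fp, const struct data *arr, size_t n)
{
    return fwrite (arr, sizeof *arr, n, fp);
}

/* Read up to n binary records from fp; returns records read. */
size_t read_records (FILE *fp, struct data *arr, size_t n)
{
    return fread (arr, sizeof *arr, n, fp);
}
```

In both cases compare the return value against the number of records you expected, and open the file in binary mode ("wb" or "rb").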
Look things over and let me know if you have further questions.

Pointing to an uninitialised pointer vs Pointing to it after allocating memory for it

This doubt is very specific; consider the code below.
The two lines in the marked block are the lines I am confused about, i.e. when I exchange those two lines I get a segmentation fault, but this code as written runs. So my question is:
What is happening when I interchange the two lines?
#include<stdio.h>
#include<stdlib.h>
typedef struct scale_node_s {
char note[4];
struct scale_node_s *linkp;
} scale_node_t;
int main(){
scale_node_t *scalep, *prevp, *newp,*walker;
int i;
scalep = (scale_node_t *)malloc(sizeof (scale_node_t));
scanf("%s", scalep->note);
prevp = scalep;
for(i = 0; i < 7; ++i) {
//---------------------------------------
newp = (scale_node_t *)malloc(sizeof (scale_node_t));
prevp->linkp = newp;
//---------------------------------------
scanf("%s", newp->note);
prevp = newp;
}
walker = scalep;
for(i = 0 ; i < 7 ; i++){
printf("%s",walker->note);
walker = walker->linkp;
}
}
The line newp = (scale_node_t *)malloc(sizeof (scale_node_t)); allocates the memory needed to hold one scale_node_t and stores that address in newp. The next line, prevp->linkp = newp;, copies that address into the previous node's linkp. If you interchange the two lines, then on the first pass through the loop prevp->linkp = newp; runs while newp is declared but has never been assigned, so its value is indeterminate: it may be leftover garbage, or it may happen to be 0 (a null pointer), depending on the OS and possibly the compiler. On every later pass, newp still holds the address of the previous node, so each node ends up linked to itself instead of to the next one. The first node's linkp is therefore invalid, and the segmentation fault occurs when the output loop follows it. Using a variable before it has been initialized is not allowed (a pointer is just a variable holding a memory address as a number), although some editors/environments/compilers may not warn you about it at compile time.
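The required order, allocate first and link second, can be sketched as a small helper. The function name append_note() is hypothetical and is not part of the code in the question.

```c
#include <stdlib.h>
#include <string.h>

typedef struct scale_node_s {
    char note[4];
    struct scale_node_s *linkp;
} scale_node_t;

/* Create a node holding note and link it after prevp (prevp may be
 * NULL when starting a new list). Returns the new node, or NULL if
 * malloc fails. append_note() is a hypothetical helper name. */
scale_node_t *append_note (scale_node_t *prevp, const char *note)
{
    scale_node_t *newp = malloc (sizeof *newp);   /* 1. newp gets a value */

    if (!newp)
        return NULL;

    strncpy (newp->note, note, sizeof newp->note - 1);
    newp->note[sizeof newp->note - 1] = '\0';     /* always terminated */
    newp->linkp = NULL;                           /* new node ends list */

    if (prevp)
        prevp->linkp = newp;                      /* 2. only now link it */

    return newp;
}
```

Setting linkp to NULL in every new node also gives the output loop a reliable way to detect the end of the list.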
In addition to the other answer, you have several additional issues that are making your list logic very confused and brittle. First, you are hardcoding loop iterations that may or may not match your input. The entire purpose of a linked-list is to provide a flexible data structure that allows you to store an unknown number of nodes. for(i = 0; i < 7; ++i) defeats that purpose entirely.
What is it you want to store? What is your input? (the note strings). Why not condition creation of additional nodes on input of a valid note? It can be as simple as:
char tmp[MAXC] = "";
...
while (scanf ("%3s", tmp) == 1) {
...
strcpy (newp->note, tmp); /* set note and linkp */
...
}
You are also leaving yourself wide open to processing invalid values throughout your code. Why? You fail to validate your user input and memory allocations. If either fail, you continue to blindly use undefined values from the point of failure forward. ALWAYS validate ALL user input and memory allocations. It is simple to do, e.g.
if (!(scalep = malloc (sizeof *scalep))) { /* allocate/validate */
fprintf (stderr, "error: virtual memory exhausted, scalep.\n");
return 1;
}
if (scanf ("%3s", scalep->note) != 1) { /* validate ALL input */
fprintf (stderr, "error: invalid input, scalep->note\n");
return 1;
}
Finally, there is no need for a prevp. That will only be required when removing or swapping nodes (you have to rewire the prev pointer to point to next (your linkp) after you remove or swap a node). You do neither in your code. There is also no need for an int counter. You simply assign current node = next node to iterate through your list (there are several variations of how to do this). Putting all the pieces together, and attempting to lay the code out in a slightly more logical way, you could do something similar to the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXC 4
typedef struct scale_node_s {
char note[MAXC];
struct scale_node_s *linkp;
} scale_node_t;
int main (void)
{
scale_node_t *scalep, *newp, *walker;
char tmp[MAXC] = "";
if (!(scalep = malloc (sizeof *scalep))) { /* allocate/validate */
fprintf (stderr, "error: virtual memory exhausted, scalep.\n");
return 1;
}
if (scanf ("%3s", scalep->note) != 1) { /* validate ALL input */
fprintf (stderr, "error: invalid input, scalep->note\n");
return 1;
}
scalep->linkp = NULL;
while (scanf ("%3s", tmp) == 1) {
if (!(newp = malloc (sizeof *newp))) { /* allocate/validate */
fprintf (stderr, "error: virtual memory exhausted, newp.\n");
break;
}
strcpy (newp->note, tmp); /* set note and linkp */
newp->linkp = NULL;
walker = scalep; /* set walker to scalep */
while (walker->linkp) /* find last node */
walker = walker->linkp; /* linkp !NULL move to next node */
walker->linkp = newp;
}
walker = scalep; /* output list */
while (walker) {
printf ("%s\n", walker->note);
walker = walker->linkp;
}
walker = scalep; /* free list memory */
while (walker) {
scale_node_t *victim = walker; /* save victim address */
walker = walker->linkp;
free (victim); /* free victim */
}
return 0;
}
Example Use/Output
$ echo "a b c d e" | ./bin/llhelp
a
b
c
d
e
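The program above walks from scalep to the last node for every insertion, which is fine for a handful of notes. One of the variations is to also keep a pointer to the last node so that appending does not require a walk. The sketch below assumes a small wrapper struct, scale_list_t, and the names list_append() and list_free(), none of which appear in the original code.

```c
#include <stdlib.h>
#include <string.h>

#define MAXC 4

typedef struct scale_node_s {
    char note[MAXC];
    struct scale_node_s *linkp;
} scale_node_t;

typedef struct {                    /* hypothetical wrapper for the list */
    scale_node_t *head, *tail;
} scale_list_t;

/* Append note at the end of the list. Returns 1 on success, 0 if the
 * allocation fails (the existing list is left intact). */
int list_append (scale_list_t *list, const char *note)
{
    scale_node_t *newp = malloc (sizeof *newp);

    if (!newp)
        return 0;

    strncpy (newp->note, note, MAXC - 1);
    newp->note[MAXC - 1] = '\0';
    newp->linkp = NULL;

    if (list->tail)                 /* non-empty: link after last node */
        list->tail->linkp = newp;
    else                            /* empty: new node is also the head */
        list->head = newp;
    list->tail = newp;

    return 1;
}

/* Free every node and reset the list to empty. */
void list_free (scale_list_t *list)
{
    scale_node_t *walker = list->head;

    while (walker) {
        scale_node_t *victim = walker;  /* save address before moving on */
        walker = walker->linkp;
        free (victim);
    }
    list->head = list->tail = NULL;
}
```

Initialize the list with scale_list_t list = { NULL, NULL }; before the first call to list_append().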
Memory Use/Error Check
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
It is imperative that you use a memory error checking program to ensure you do not attempt to write beyond/outside the bounds of your allocated block of memory, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
$ echo "a b c d e" | valgrind ./bin/llhelp
==25758== Memcheck, a memory error detector
==25758== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==25758== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==25758== Command: ./bin/llhelp
==25758==
a
b
c
d
e
==25758==
==25758== HEAP SUMMARY:
==25758== in use at exit: 0 bytes in 0 blocks
==25758== total heap usage: 5 allocs, 5 frees, 80 bytes allocated
==25758==
==25758== All heap blocks were freed -- no leaks are possible
==25758==
==25758== For counts of detected and suppressed errors, rerun with: -v
==25758== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Always confirm that you have freed all memory you have allocated and that there are no memory errors.
Look things over and let me know if you have any additional questions.

Resources