Use fscanf to read strings and empty lines

I have a text file containing keywords and integers and have access to the file stream in order to parse this file.
I am able to parse it by doing
while( fscanf(stream, "%s", word) != -1 ) which gets each word and int in the file for me to parse, but the problem I'm having is that I cannot detect an empty line "\n" which then I need to detect for something. I can see that \n is a character thus not detected by %s. What can I do to modify fscanf to also get EOL characters?

You can do exactly what you want with fscanf, but the number of checks and validations required to do it properly and completely is painful compared to using a line-oriented input function like fgets.
With fgets (or POSIX getline), detecting an empty line requires nothing beyond reading a normal line. For example, to read a line of text with fgets, you provide a buffer of sufficient size and make a single call that reads up to and including the '\n' into buf:
while (fgets (buf, BUFSZ, fp)) { /* read each line in file */
To check whether the line was empty, you simply check whether the first character in buf is the '\n' character, e.g.
if (*buf == '\n')
/* handle blank line */
or, in the normal course of things, you will be removing the trailing '\n' by obtaining the length and overwriting the '\n' with the nul-terminating character. In which case, you can simply check if length is 0 (after removal), e.g.
size_t len = strlen (buf); /* get buf length */
if (len && buf[len-1] == '\n') /* check last char is '\n' */
buf[--len] = 0; /* overwrite with nul-character */
(note: if the last character was not '\n', you know the line was longer than the buffer and characters in the line remain unread -- and will be read on the next call to fgets, or you have reached the end of the file with a non-POSIX line ending on the last line)
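The trimming step can also be wrapped in a small helper. The sketch below is one way to do it, using strcspn in place of strlen; the name trim_newline is hypothetical and not part of the original answer. It reports whether a '\n' was actually found, so the caller can tell a complete line from one that was cut short by the buffer size:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: overwrite the first '\n' in s with the
 * nul-character, store the resulting length in *len (if len is not NULL),
 * and return true only if a '\n' was found (the line was complete).
 */
bool trim_newline (char *s, size_t *len)
{
    size_t n = strcspn (s, "\n");   /* index of first '\n', or of the nul */
    bool had_nl = (s[n] == '\n');

    s[n] = 0;                       /* harmless if s[n] was already nul */
    if (len)
        *len = n;

    return had_nl;
}
```

After the call, a length of 0 together with a return of true identifies an empty line.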
Putting it all together, an example that uses fgets to identify empty lines, and that still prints complete lines when a line exceeds the buffer length, could look like the following:
#include <stdio.h>
#include <string.h>
#define BUFSZ 4096
int main (int argc, char **argv) {
size_t n = 1;
char buf[BUFSZ] = "";
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (fgets (buf, BUFSZ, fp)) { /* read each line in file */
size_t len = strlen (buf); /* get buf length */
if (len && buf[len-1] == '\n') /* check last char is '\n' */
buf[--len] = 0; /* overwrite with nul-character */
else { /* line too long or non-POSIX file end, handle as required */
printf ("line[%2zu] : %s\n", n, buf);
continue;
} /* output line (or "empty" if line was empty) */
printf ("line[%2zu] : %s\n", n++, len ? buf : "empty");
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
return 0;
}
Example Input File
$ cat ../dat/captnjack2.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Example Use/Output
$ ./bin/fgetsblankln ../dat/captnjack2.txt
line[ 1] : This is a tale
line[ 2] : empty
line[ 3] : Of Captain Jack Sparrow
line[ 4] : empty
line[ 5] : A Pirate So Brave
line[ 6] : empty
line[ 7] : On the Seven Seas.
So Why Does Everybody Recommend fgets?
Well, let's take a look at doing the same thing with fscanf and I'll let you be the judge. To begin with, fscanf does not read or include the trailing '\n' with the "%s" format specifier (which stops at whitespace) or with the character class "%[^\n]" (because the newline was specifically excluded). So you cannot read both (1) a line with characters and (2) a line without characters using the same format string. You either read characters and fscanf succeeds, or you don't and you experience a matching failure.
So as alluded to in the comments, you have to pre-check if the next character in the input buffer is a '\n' character using fgetc (or getc) and then put it back in the input buffer with ungetc if it isn't.
Further adding to your fscanf task, you must independently validate each check, put back, and read every step along the way. This results in quite a number of checks to handle all cases and provide all checks necessary to avoid undefined behavior.
As part of those checks you will need to limit the number of characters you read to one less than the size of the buffer, while capturing the next character to determine whether the line was too long to fit. Additional checks are required to handle (without failure) a file with a non-POSIX line end on the final line -- something fgets handles without issue.
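The check-and-put-back step described above can be isolated in a tiny helper. This is only a sketch; the name peek_char is hypothetical. It reads one character with fgetc and, unless it hit EOF, pushes the character back with ungetc so the next read sees it again:

```c
#include <stdio.h>

/* Hypothetical helper: return the next character in fp without
 * consuming it. Returns EOF at end of file, on a read error, or if the
 * character could not be pushed back.
 */
int peek_char (FILE *fp)
{
    int c = fgetc (fp);                     /* read the next character */

    if (c != EOF && ungetc (c, fp) == EOF)  /* put it back, validate   */
        return EOF;

    return c;
}
```

A caller would test peek_char (fp) == '\n' to detect an empty line and then consume the newline with a plain fgetc before moving on.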
Below is an implementation similar to the fgets code above. Go through it and understand why each step is necessary and what each validation protects against. You may be able to rearrange it slightly, but it has been whittled down to close to the bare minimum. After going through it, it should become clear why fgets is the preferred method for handling checks for empty lines (as well as for line-oriented input generally).
#include <stdio.h>
#define BUFSZ 4096
int main (int argc, char **argv) {
int c = 0, r = 0;
size_t n = 1;
char buf[BUFSZ] = "", nl = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
for (;;) { /* loop until EOF */
if ((c = fgetc (fp)) == '\n') /* check next char is '\n' */
*buf = 0; /* make buf empty-string */
else {
if (c == EOF) /* check if EOF */
break;
if (ungetc (c, fp) == EOF) { /* ungetc/validate */
fprintf (stderr, "error: ungetc failed.\n");
break;
}
/* read line into buf and '\n' into nl, handle failure */
if ((r = fscanf (fp, "%4095[^\n]%c", buf, &nl)) != 2) {
if (r == EOF) { /* EOF (input failure) */
break;
} /* check next char, if not EOF, non-POSIX eol */
else if ((c = fgetc (fp)) != EOF) {
if (ungetc (c, fp) == EOF) { /* unget it */
fprintf (stderr, "error: ungetc failed.\n");
break;
} /* read line again handling non-POSIX eol */
if (fscanf (fp, "%4095[^\n]", buf) != 1) {
fprintf (stderr, "error: fscanf failed.\n");
break;
}
}
} /* good fscanf, validate nl = '\n' or line too long */
else if (nl != '\n') {
fprintf (stderr, "error: line %zu too long.\n", n);
break;
}
} /* output line (or "empty" for empty line) */
printf ("line[%2zu] : %s\n", n++, *buf ? buf : "empty");
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
return 0;
}
The Use/Output is identical to above. Look things over and let me know if you have any further questions.

Related

Check if all letters of the alphabet appear in a file with no repeats

I'm writing a program that needs to read a CSV file and check that all letters in the alphabet appear one time on each side of the comma. The file would look something like this:
a,x
b,j
c,g
d,l
e,s
f,r
g,u
h,z
i,w
j,c
k,e
l,a
m,v
but there would be 26 lines total. What would be the most efficient way to check that each side has all 26 letters with no repeats?

While it is unclear from your question and follow-up comments where exactly you are stuck, or whether you have thrown in the towel and given up, let's take it from the beginning.
Open Your File (or reading stdin)
Before you can do anything with the content of your file, you need to open your file for reading. For reading formatted input you will generally use the functions that read and write from a file stream using a FILE * stream pointer (as opposed to the low-level file-descriptor interface). To open your file, you call fopen and check the return to validate that the open succeeded.
Do not hard-code filenames or numbers in your program. Your program takes arguments, so either pass the filename to open as an argument or prompt for entry of the filename. You can increase the flexibility of your program by taking the filename to read as an argument, and reading from stdin by default if no argument is provided (as most Linux utilities do). Since stdin is a file stream, you can simply assign it to your FILE * pointer if you are not opening a filename provided as an argument. For example:
FILE *fp = NULL;
if (argc > 1)                       /* if one argument provided */
    fp = fopen (argv[1], "r");      /* open file with name from argument */
else
    fp = stdin;                     /* set fp to stdin */
if (!fp) {                          /* validate file open for reading */
    perror ("file open failed");
    return 1;
}
which can be shortened using the ternary operator, e.g.:
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
Reading Your Data
With a file stream open and validated, you can now read your data from your file. While you could read with fscanf, you are limited in the information it provides in the event fewer than two values are read. Additionally, reading with the scanf family of functions is full of pitfalls due to what characters remain in your input file stream depending on the conversion specifiers used and on whether the conversion succeeded or failed. Nonetheless, a simple approach that validates two conversions took place according to your format string will allow you to read your file, e.g.
char c1, c2; /* characters from each line */
int freq1[MAXC] = {0}, freq2[MAXC] = {0}; /* frequency arrays */
...
while (fscanf (fp, " %c, %c", &c1, &c2) == 2) /* read all chars */
if (c1 > 0 && c2 > 0) /* validate both are ASCII values */
/* increment element in each */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
(the downside is any variation in format on one line can leave you with unwanted characters stored, and if less than two conversions take place, your read loop stops despite valid data remaining unread)
A better approach is to read a line at a time with a line-oriented input function such as fgets or POSIX getline. With this approach, you are consuming a line of data at a time, and then parsing the needed information from the stored line. The benefits are significant. You have an independent validation of the read itself, and then of whether you find the needed values in the line. If your format varies and you parse fewer than the needed values from the line, you have the option of simply skipping that line and continuing with the next. Further, what remains in your input file stream does not depend on the conversion specifiers used.
An example with fgets and sscanf doing the same thing would be:
char c1, c2, /* characters from each line */
buf[MAXC] = ""; /* buffer to hold each line */
...
while (fgets (buf, MAXC, fp)) /* read all chars */
if (sscanf (buf, " %c, %c", &c1, &c2) == 2) { /* parse values */
if (c1 > 0 && c2 > 0) /* validate both are ASCII values */
/* increment element in each */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
}
else
fputs ("error: in line format.\n", stderr);
Handling the Frequency of Characters
If you have been paying attention to the read of the data from the file, you will note that a pair of frequency arrays have been incremented on each read of the characters freq1 and freq2. As mentioned in my comments above, you start with an adequately sized array of int to hold the ASCII character set. The arrays are initialized to zero. When you read a character from each column, you simply increment the value at:
if (c1 > 0 && c2 > 0) /* validate both are ASCII values */
/* increment each element */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
For example the ASCII value for 'a' is 97 (see ASCII Table and Description). So if you read an 'a' and increment
freq1['a']++;
that is the same as incrementing:
freq1[97]++;
When you are done with your read loop, you simply need to iterate over your frequency arrays from 'a' to 'z' and the number of times the corresponding character appeared in your file will be captured in your array. Then you can use the data however you like.
Outputting The Results
The simplest way to output your column1/column2 results is simply to output the number of occurrences for each character. For example:
for (int i = 'a'; i <= 'z'; i++) /* loop over 'a' to 'z' */
printf (" %c: %d, %d\n", i, freq1[i], freq2[i]);
Which will produce output similar to:
$ ./bin/freq_dual_col2 <dat/char2col.txt
lowercase occurrence:
a: 1, 1
b: 1, 0
c: 1, 1
d: 1, 0
e: 1, 1
f: 1, 0
...
If you wanted to get a little more verbose and note whether each character appears exactly once, not at all ("none"), or more than once ("dupe"), you could employ a few additional checks, e.g.
for (int i = 'a'; i <= 'z'; i++) { /* loop over 'a' to 'z' */
if (freq1[i] == 1) /* check col 1 chars */
printf (" %c , ", i);
else if (!freq1[i])
fputs ("none, ", stdout);
else
fputs ("dupe, ", stdout);
if (freq2[i] == 1) /* check col 2 chars */
printf (" %c\n", i);
else if (!freq2[i])
fputs ("none\n", stdout);
else
fputs ("dupe\n", stdout);
}
Which would produce output as:
$ ./bin/freq_single_dual_col <dat/char2col.txt
lowercase single occurrence, none or dupe:
a , a
b , none
c , c
d , none
e , e
f , none
...
Putting it all together, your minimal example using fscanf for your read could be similar to:
#include <stdio.h>
#include <limits.h>
#define MAXC (UCHAR_MAX + 1) /* parenthesized so the macro expands safely */
int main (int argc, char **argv) {
char c1, c2; /* characters from each line */
int freq1[MAXC] = {0}, freq2[MAXC] = {0}; /* frequency arrays */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fscanf (fp, " %c,%c", &c1, &c2) == 2) /* read all chars */
if (c1 > 0 && c2 > 0) /* validate both are ASCII values */
/* increment each element */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
if (fp != stdin) fclose (fp); /* close file if not stdin */
puts ("lowercase occurrence:\n");
for (int i = 'a'; i <= 'z'; i++) /* loop over 'a' to 'z' */
printf (" %c: %d, %d\n", i, freq1[i], freq2[i]);
return 0;
}
The example using fgets and sscanf would be similar to:
#include <stdio.h>
#include <limits.h>
#define MAXC (UCHAR_MAX + 1) /* parenthesized so the macro expands safely */
int main (int argc, char **argv) {
char c1, c2, /* characters from each line */
buf[MAXC] = ""; /* buffer to hold each line */
int freq1[MAXC] = {0}, freq2[MAXC] = {0}; /* frequency arrays */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fgets (buf, MAXC, fp)) /* read each line */
if (sscanf (buf, " %c, %c", &c1, &c2) == 2) { /* parse values */
if (c1 > 0 && c2 > 0) /* validate both are ASCII values */
/* increment each element */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
}
else
fputs ("error: in line format.\n", stderr);
if (fp != stdin) fclose (fp); /* close file if not stdin */
puts ("lowercase occurrence:\n");
for (int i = 'a'; i <= 'z'; i++) /* loop over 'a' to 'z' */
printf (" %c: %d, %d\n", i, freq1[i], freq2[i]);
return 0;
}
And if you wanted the more verbose output, then I leave it to you to incorporate it in the code above.
Look things over and let me know if you have further questions.

Add the letters of each column to a Set and check that each Set is the same size as the number of lines in the file.
Remember that Sets ignore duplicates.
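C has no built-in Set type, so one way to sketch this idea in C is a 26-bit mask with one bit per letter. The names set_add and ALL_LETTERS below are hypothetical, and the code assumes 'a' through 'z' are contiguous (as in ASCII):

```c
#include <stdbool.h>
#include <stdint.h>

#define ALL_LETTERS ((UINT32_C(1) << 26) - 1)   /* bits 0..25 all set */

/* Hypothetical helper: add lowercase letter c to the set.
 * Returns false if c is not in 'a'..'z' or is already in the set.
 */
bool set_add (uint32_t *set, char c)
{
    if (c < 'a' || c > 'z')                     /* not a lowercase letter */
        return false;

    uint32_t bit = UINT32_C(1) << (c - 'a');    /* bit for this letter */

    if (*set & bit)                             /* duplicate */
        return false;

    *set |= bit;
    return true;
}
```

With one mask per column, the file is valid when every set_add call succeeds and both masks equal ALL_LETTERS after the last line.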

C copy substring from text file

Say I have the following text file -
name:asdfg
address:zcvxz
,
name:qwerwer
address:zxcvzxcvxz
,
And I wanna copy the name (without "name:") to a certain string variable, the address to another and so on.
How do I do so without corrupting memory?
Tried using (example) -
char buf[50];
while (fgets(buf, 50, file) != NULL) {
if (!strncmp(buf, "name", 4))
strncpy(somestring, buf + 5, 20)
//do the same for address, continue looping
but the text lines differ in length, so it seems to copy all sorts of crap from the buffer, as the strings arent null terminated so it copies "asdfgcrapcrapcrap".

You are to be commended for using fgets to handle your file I/O as it provides a much more flexible and robust way to read, validate and prepare to parse the lines of data you read. It is generally the recommended way to do line-oriented input (either from a file or from the user). However, this is one of those circumstances where treating multiple records as formatted input does have some advantages.
Let's start with an example that reads your data file and captures the name:... and address:... data in a simple data structure holding both the name and address values in a 20-char array each. Each line is read, the length is validated, the trailing '\n' is removed, and then strchr is used to locate the ':' in the line (we don't care about lines without ':'). The label before ':' is copied to tmp and then compared against "name" or "address" to determine which value to read. Once the address data is read, both name and addr values are printed to stdout:
#include <stdio.h>
#include <string.h>
enum { MAXC = 20, MAXS = 256 };
typedef struct {
char name[MAXC],
addr[MAXC];
} data;
int main (int argc, char **argv) {
char buf[MAXS] = "",
*name = "name", /* name/address literals for comparison */
*addr = "address";
data mydata = { .name = "" };
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (fgets (buf, MAXS, fp)) { /* read each line */
char *p = buf, /* pointer to use with strchr */
tmp[MAXC] = ""; /* storage for labels */
size_t len = strlen (buf); /* get buf len */
if (len && buf[len - 1] == '\n') /* validate last char is '\n' */
buf[--len] = 0; /* overwrite with nul-character */
else if (len + 1 == MAXS) { /* handle string too long */
fprintf (stderr, "error: line too long or no '\\n'\n");
return 1;
}
if ((p = strchr (buf, ':'))) { /* find ':' in buf */
size_t labellen = p - buf, /* get length of label */
datalen = strlen (p + 1); /* get length of data */
if (labellen + 1 > MAXC) { /* validate both lengths */
fprintf (stderr, "error: label exceeds '%d' chars.\n", MAXC);
return 1;
}
if (datalen + 1 > MAXC) {
fprintf (stderr, "error: data exceeds '%d' chars.\n", MAXC);
return 1;
}
strncpy (tmp, buf, labellen); /* copy label to temp */
tmp[labellen] = 0; /* nul-terminate */
if (strcmp (name, tmp) == 0) /* is the label "name" ? */
strcpy (mydata.name, p + 1);
else if (strcmp (addr, tmp) == 0) { /* is the label "address" ? */
strcpy (mydata.addr, p + 1);
/* record complete -- output results */
printf ("\nname : %s\naddr : %s\n", mydata.name, mydata.addr);
}
}
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
return 0;
}
(note: there are many ways to structure this logic. The example above just represents a semi-standard method)
Example Use/Output
$ ./bin/nameaddr <dat/nameaddr.txt
name : asdfg
addr : zcvxz
name : qwerwer
addr : zxcvzxcvxz
Here is where I will have a tough time convincing you that fgets was the way to go for this problem. Why? Here we are essentially reading formatted input made up of 3 lines of data per record. The format string for fscanf doesn't care how many lines are involved, and can easily be constructed to skip the '\n' characters within the formatted input. This can provide an attractive (though more fragile) alternative for the right input files.
For example, the code above can be reduced to the following using fscanf for a formatted read:
#include <stdio.h>
#define MAXC 20
typedef struct {
char name[MAXC],
addr[MAXC];
} data;
int main (int argc, char **argv) {
data mydata = { .name = "" };
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
/* read 3-lines at a time separating name and address at once */
while (fscanf (fp, " name:%19s address:%19s ,",
mydata.name, mydata.addr) == 2)
printf ("\nname : %s\naddr : %s\n", mydata.name, mydata.addr);
if (fp != stdin) fclose (fp); /* close file if not stdin */
return 0;
}
(the output is the same)
In the rare case, for the correct data file, fscanf can provide a viable alternative to a line-oriented read with fgets. However, your first choice should remain a line-oriented approach using either fgets or POSIX getline.
Look both over and let me know if you have further questions.

If the name is 20 characters or longer, strncpy() won't copy the null terminator to the destination string, so you need to add it yourself.
strncpy(somestring, buf + 5, 19);
somestring[19] = '\0';
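The same idea can be wrapped in a helper that always nul-terminates and also drops the line ending left by fgets. This is a sketch only; the name copy_value is hypothetical, and truncating an over-long value (rather than rejecting it) is an assumption about what the caller wants:

```c
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dstsz bytes), stopping
 * at the first '\r' or '\n' and truncating if needed. dst is always
 * nul-terminated when dstsz > 0. Returns the number of characters copied.
 */
size_t copy_value (char *dst, size_t dstsz, const char *src)
{
    size_t n = strcspn (src, "\r\n");   /* length up to the line ending */

    if (dstsz == 0)                     /* no room for even the nul */
        return 0;
    if (n >= dstsz)                     /* truncate to fit */
        n = dstsz - 1;

    memcpy (dst, src, n);
    dst[n] = '\0';                      /* always nul-terminate */

    return n;
}
```

In the loop from the question this would be called as copy_value (somestring, sizeof somestring, buf + 5), assuming somestring is an array rather than a pointer.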

Simple C program to read a file line by line

What I would like to do is read the whole first line of the file but then after the first line only read the following lines until whitespace is hit. My end goal is to ask the user what line they want to edit by adding/subtracting time to said line.
Sample File
My test file
00:19.1 123456
00:35.4 testing whitespace end
Desired Output
1: My test file
2: 00:19.1
3: 00:35.4
Code:
#include <stdio.h>
#include <stdlib.h>
int main()
{
FILE *fptr1, *fptr2;
char filechar[40];
char c[50];
int line_number = 1;
int replace_line, temp = 1;
printf("Please enter a file name: ");
scanf("%s", &filechar);
if ((fptr1 = fopen(filechar, "r")) == NULL)
{
printf("Error locating desired file");
exit(1);
}
c = getc(fptr1);
while (c != EOF)
{
//printf("%d: %c",line_number, c);
printf("%s",c);
c = getc(fptr1);
//line_number++;
}
return 0;
}

In C you have character oriented input functions (e.g. getchar, fgetc), you have formatted input functions (e.g. the scanf family) and then you have line oriented input functions. (e.g. fgets and POSIX getline). When you are reading lines of data, line oriented input functions are the proper tool for the job. (taking user input with scanf has many pitfalls that new (and even not so new) C programmers fall into)
All line oriented functions read and include the '\n' in the buffer they fill. You can, and should, remove the newline from the resulting buffer if it will be used later on in your code. A simple
size_t n = strlen (buf);
if (n && buf[n-1] == '\n')      /* check n so buf[n-1] is never buf[-1] */
    buf[--n] = 0;
is all you need to overwrite the trailing '\n' with a nul-terminating character. If you are just printing the line immediately and not storing it for later use, then it's not worth removing the newline (just account for it in your output format string).
Putting those pieces together, you can read each line, handle the first by simply outputting it, and for each remaining line, parse the time (presumably some elapsed time) from the full string read by fgets with sscanf and format the output as you specify. E.g.
#include <stdio.h>
#define MAXC 64 /* define constants, don't use magic number in code */
int main (int argc, char **argv) {
char buf[MAXC] = ""; /* buffer to hold each line -- size as reqd */
int line = 1;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
while (fgets (buf, sizeof buf, fp)) { /* read each line in file */
char et[MAXC] = ""; /* buffer for holding time */
if (line == 1) /* if 1st line, just print */
printf ("%d : %s", line, buf); /* note: \n included by fgets */
else {
if (sscanf (buf, "%s", et) != 1) { /* parse up to first whitespace */
fprintf (stderr, "error: invalid conversion, line %d\n", line);
return 1;
}
printf ("%d : %s\n", line, et); /* output elapsed time only */
}
line++; /* increment line count */
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
return 0;
}
note: you should protect against buffer overrun on the parse by including a field-width specifier in the sscanf format string (e.g. sscanf (buf, "%63s", et)). That is one place where all you can do is include a magic number in your code, because there is no way to directly specify a variable field width for sscanf -- unless you creatively use sprintf to create the format string ahead of time -- but that's for another day.
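For the curious, building the format string at run time might look like the following sketch. The helper name first_token is hypothetical, and snprintf is used in place of sprintf so the format buffer itself cannot overflow. The "%%" in the snprintf format produces a literal '%', so a destination of 64 bytes yields the format "%63s":

```c
#include <stdio.h>

/* Hypothetical helper: copy the first whitespace-delimited token of line
 * into dst (capacity dstsz bytes) without overrunning dst.
 * Returns 1 on success, 0 if no token was found or dst is too small.
 */
int first_token (const char *line, char *dst, size_t dstsz)
{
    char fmt[32];                   /* holds e.g. "%63s" */

    if (dstsz < 2)                  /* need room for 1 char + nul */
        return 0;

    /* field width is one less than the buffer size, leaving room for nul */
    snprintf (fmt, sizeof fmt, "%%%zus", dstsz - 1);

    return sscanf (line, fmt, dst) == 1;
}
```

A token longer than the field width is cut off at dstsz - 1 characters rather than rejected; whether that is acceptable depends on the data.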
Example Input File
$ cat dat/et.txt
My test file
00:19.1 123456
00:35.4 testing whitespace end
Example Use/Output
$ ./bin/et <dat/et.txt
1 : My test file
2 : 00:19.1
3 : 00:35.4
Look things over and let me know if you have any further questions.
(note: I take the filename as the first argument to the program, or read from stdin if no filename is given. C provides for command line arguments -- use them. It's fine to prompt for input if needed; otherwise, it's far easier just to specify arguments on the command line :)

Please try if this C code can help you. It just reads the file line by line and replaces whitespace with the string termination character \0.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* strchr (standard header, replaces memory.h) */
char* replace_char(char* str, char find, char replace){
char *current_pos = strchr(str,find);
while (current_pos){
*current_pos = replace;
current_pos = strchr(current_pos,find);
}
return str;
}
int main(void)
{
FILE * fp;
char * line = NULL;
size_t len = 0;
ssize_t read;
fp = fopen("/home/developer/CLionProjects/untitled4/download.out", "r");
if (fp == NULL)
exit(EXIT_FAILURE);
int count=0;
while ((read = getline(&line, &len, fp)) != -1) {
if (count==0) printf("%s", line);
else printf("%s\n", replace_char(line, ' ', '\0'));
count++;
}
fclose(fp);
if (line)
free(line);
exit(EXIT_SUCCESS);
}
File
My test file
00:19.1 123456
00:35.4 testing whitespace end
Output
My test file
00:19.1
00:35.4

C program to copy .csv of integers copies one less element unless element size is set to +1

I'm new to learning the C language and I wanted to write a simple program that would copy an array of integers from one .csv file to a new .csv file. My code works as intended, however when my array size for fread/fwrite is set to the exact number of elements in the .csv array (10 in this case), it only copies nine of the elements.
When the array size is set to +1, it copies all the elements.
#include <stdio.h>
#include <stdlib.h>
#define LISTSIZE 11
//program that copies an array of integers from one .csv to another .csv
int main(int argc, char * argv[])
{
if (argc != 2)
{
fprintf(stderr, "Usage ./file_sort file.csv\n");
return 1;
}
char * csvfile = argv[1];
FILE * input_csvile = fopen(csvfile, "r"); //open .csv file and create file pointer input_csvile
if(input_csvile == NULL)
{
fprintf(stderr, "Error, Could not open\n");
return 2;
}
unsigned int giving_total[LISTSIZE];
if(input_csvile != NULL) //after file opens, read array from .csv input file
{
fread(giving_total, sizeof(int), LISTSIZE, input_csvile);
}
else
fprintf(stderr, "Error\n");
FILE * printed_file = fopen("school_currentfy1.csv", "w");
if (printed_file != NULL)
{
fwrite(giving_total, sizeof(int), LISTSIZE, printed_file); //copy array of LISTSIZE integers to new file
}
else
fprintf(stderr, "Error\n");
fclose(printed_file);
fclose(input_csvile);
return 0;
}
Does this have something to do with the array being 0-indexed and the .csv file being 1-indexed? I also had an output with the LISTSIZE of 11 which had the last (10) element being displayed incorrectly; 480 instead of 4800.
http://imgur.com/lLOozrc Output/input with LISTSIZE of 10
http://imgur.com/IZPGwsA Input/Output with LISTSIZE of 11

Note: as noted in the comment, fread and fwrite are for reading and writing binary data, not text. If you are dealing with a .csv (comma separated values -- e.g. as exported from MS Excel or Open/LibreOffice calc) You will need to use fgets (or any other character/string oriented function) followed by sscanf (or strtol, strtoul) to read the values as text and perform the conversion to int values. To write the values to your output file, use fprintf. (fscanf is also available for input text processing and conversion, but you lose flexibility in handling variations in input format)
However, if your goal was to read binary data for 10 integers (e.g. 40 bytes of data, assuming a 4-byte int), then fread and fwrite are fine, but as with all input/output routines, you need to validate the number of items read and written to ensure you are dealing with valid data within your code (and that you have a valid output data file when you are done).
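For the binary case, validating the return of fread and fwrite could be sketched as below. The function name copy_ints and the chunk size of 64 are hypothetical choices, and the sketch assumes both streams were opened in binary mode ("rb" / "wb"). Note that fread and fwrite return a count of items, not bytes:

```c
#include <stdio.h>

/* Hypothetical helper: copy binary int values from in to out.
 * Returns the number of ints copied, or -1 on a read or write error.
 */
long copy_ints (FILE *in, FILE *out)
{
    int chunk[64];                  /* copy in chunks of up to 64 ints */
    long total = 0;
    size_t nread;

    /* fread returns the number of complete items read */
    while ((nread = fread (chunk, sizeof *chunk, 64, in)) > 0) {
        /* validate that every item read was written */
        if (fwrite (chunk, sizeof *chunk, nread, out) != nread)
            return -1;
        total += (long)nread;
    }

    return ferror (in) ? -1 : total;    /* distinguish error from EOF */
}
```

Because the loop uses the count fread actually returned, it copies exactly as many values as the file holds, with no need to size an array one element larger.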
There are many ways to read a .csv file, depending on the format. One generic way is to simply read each line of text with fgets and then repeatedly call sscanf to convert each value (this has a number of advantages in handling different spacing around the ',' compared to fscanf). You read each line, assign a pointer to the beginning of the buffer filled by fgets, call sscanf (with %n to return the number of characters processed by each call), advance the pointer by that number, and then scan forward in the buffer until the next '-' (for negative values) or digit is encountered. (Using %n and scanning forward can allow fscanf to be used in a similar manner.) For example:
/* read each line until LISTSIZE integers read or EOF */
while (numread < LISTSIZE && fgets (buf, MAXC, fp)) {
int nchars = 0; /* number of characters processed by sscanf */
char *p = buf; /* pointer to line */
/* (you should check a whole line is read here) */
/* while chars remain in buf, less than LISTSIZE ints read
* and a valid conversion to int performed by sscanf, update p
* to point to start of next number.
*/
while (*p && numread < LISTSIZE &&
sscanf (p, "%d%n", &giving_total[numread], &nchars) == 1) {
numread++; /* increment the number read */
p += nchars; /* move p nchars forward in buf */
/* find next digit in buf */
while (*p && *p != '-' && (*p < '0' || *p > '9'))
p++;
}
}
Now to create your output file, you simply write numread values back out in comma separated value format (you can adjust how many you write per line as required):
for (i = 0; i < numread; i++) /* write in csv format */
fprintf (fp, i ? ",%d" : "%d", giving_total[i]);
fputc ('\n', fp); /* tidy up -- make sure file ends with '\n' */
Then it is just a matter of closing your output file and checking for any stream errors (always check on close when writing values to a file)
if (fclose (fp)) /* always validate close after write to */
perror("error"); /* validate no stream errors occurred */
Putting it all together, you could do something similar to the following:
#include <stdio.h>
#include <stdlib.h>
#define LISTSIZE 10
#define MAXC 256
int main(int argc, char *argv[])
{
if (argc < 2) { /* input file required, output file optional */
fprintf(stderr, "Usage ./file_sort file.csv [outfile]\n");
return 1;
}
int giving_total[LISTSIZE]; /* change to int to handle negative values */
size_t i, numread = 0; /* generic i and number of integers read */
char *csvfile = argv[1],
buf[MAXC] = ""; /* buffer to hold MAXC chars of text */
FILE *fp = fopen (csvfile, "r");
if (fp == NULL) { /* validate csvfile open for reading */
fprintf(stderr, "Error, Could not open input file.\n");
return 2;
}
/* read each line until LISTSIZE integers read or EOF */
while (numread < LISTSIZE && fgets (buf, MAXC, fp)) {
int nchars = 0; /* number of characters processed by sscanf */
char *p = buf; /* pointer to line */
/* (you should check a whole line is read here) */
/* while chars remain in buf, less than LISTSIZE ints read
* and a valid conversion to int performed by sscanf, update p
* to point to start of next number.
*/
while (*p && numread < LISTSIZE &&
sscanf (p, "%d%n", &giving_total[numread], &nchars) == 1) {
numread++; /* increment the number read */
p += nchars; /* move p nchars forward in buf */
/* find next digit in buf */
while (*p && *p != '-' && (*p < '0' || *p > '9'))
p++;
}
}
if (numread < LISTSIZE) /* warn if less than LISTSIZE integers read */
fprintf (stderr, "Warning: only '%zu' integers read from file\n", numread);
fclose (fp); /* close input file */
fp = fopen (argc > 2 ? argv[2] : "outfile.csv", "w"); /* open output file */
if (fp == NULL) { /* validate output file open for writing */
fprintf(stderr, "Error, Could not open output file.\n");
return 3;
}
for (i = 0; i < numread; i++) /* write in csv format */
fprintf (fp, i ? ",%d" : "%d", giving_total[i]);
fputc ('\n', fp); /* tidy up -- make sure file ends with '\n' */
if (fclose (fp)) /* always validate close after write to */
perror("error"); /* validate no stream errors occurred */
return 0;
}
Like I said, there are many, many ways to approach this. The idea is to build as much flexibility into your read as possible so it can handle any variations in the input format without choking. Another very robust way to approach the read is using strtol (or strtoul for unsigned values). Both will advance a pointer for you to the next character following the integer converted, so you can start your scan for the next digit from there.
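A minimal sketch of the strtol approach is shown here. The function name parse_ints is hypothetical; it skips ahead to a digit (or a '-' immediately followed by a digit), converts with strtol, and uses the end pointer strtol provides to continue after the number. Values that do not fit in an int are skipped in this sketch, which is an assumption about the desired behavior:

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert up to max integers found anywhere in s,
 * storing them in out. Returns the number of integers stored.
 */
size_t parse_ints (const char *s, int *out, size_t max)
{
    size_t n = 0;

    while (*s && n < max) {
        char *end;

        /* skip anything that cannot start a number */
        if (!(isdigit ((unsigned char)*s) ||
              (*s == '-' && isdigit ((unsigned char)s[1])))) {
            s++;
            continue;
        }

        errno = 0;                          /* reset before each call */
        long v = strtol (s, &end, 10);      /* end -> 1st char after number */

        if (errno == 0 && v >= INT_MIN && v <= INT_MAX)
            out[n++] = (int)v;              /* store only in-range values */

        s = end;                            /* continue after the number */
    }

    return n;
}
```

It would be called once per line read by fgets, passing the address of the next free array element and the number of elements still available.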
An example of the read flexibility provide in either of these approaches is shown below. Reading a file of any number of lines, with values separate by any separator and converting each integer encountered to a value in your array, e.g.
Example Input
$ cat ../dat/10int.csv
8572, -2213, 6434, 16330, 3034
12346, 4855, 16985, 11250, 1495
Example Program Use
$ ./bin/fgetscsv ../dat/10int.csv dat/outfile.csv
Example Output File
$ cat dat/outfile.csv
8572,-2213,6434,16330,3034,12346,4855,16985,11250,1495
Look things over and let me know if you have questions. If your intent was to read 40-bytes in binary form, just let me know and I'm happy to help with an example there.
If you want a truly generic read of values in a file, you can tweak the code that finds the number in the input file to scan forward in the file and validate that any '-' is followed by a digit. This allows reading any format and simply picking the integers from the file. For example with the following minor change:
while (*p && numread < LISTSIZE) {
if (sscanf (p, "%d%n", &giving_total[numread], &nchars) == 1)
numread++; /* increment the number read */
p += nchars; /* move p nchars forward in buf */
/* find next number in buf */
for (; *p; p++) {
if (*p >= '0' && *p <= '9') /* positive value */
break;
if (*p == '-' && *(p+1) >= '0' && *(p+1) <= '9') /* negative */
break;
}
}
You can easily process the following file and obtain the same results:
$ cat ../dat/10intmess.txt
8572,;a -2213,;--a 6434,;
a- 16330,;a
- The Quick
Brown%3034 Fox
12346Jumps Over
A
4855,;*;Lazy 16985/,;a
Dog.
11250
1495
Example Program Use
$ ./bin/fgetscsv ../dat/10intmess.txt dat/outfile2.csv
Example Output File
$ cat dat/outfile2.csv
8572,-2213,6434,16330,3034,12346,4855,16985,11250,1495

how to find tail of a file using c code using 2D array

I want to print the last n lines of a file using a C program.
I have already done it with fseek. Now I want to try it using an array. I have written the code, but it gives a segmentation fault (core dumped) error.
Please help me fix this code:
#include <stdio.h>
#include <stdlib.h>
char s[10][100];
int main(int argc, char *argv[])
{
FILE *in, *out;
int count = 0;
long pos;
char c;
if ((in = fopen("count.txt", "r")) == NULL)
{
perror("fopen");
exit(EXIT_FAILURE);
}
if ((out = fopen("output.txt", "w")) == NULL)
{
printf("error in opening file");
exit(EXIT_FAILURE);
}
if (argc < 2)
fprintf(stderr, "Arguments which is to be passed should be 2\n");
else if (argc > 2)
printf("too many argumnets are passed");
else
{
int n = atoi(argv[1]);
if(n >= 1 && n< 100)
{
int j,i;
for(i=0;i<=n;i++)
{
j=0;
c = fgetc(in);
for (; j != EOF; ++j)
{
s[i][j]=c;
fputc(c, out);
c = fgetc(in);
if(s[i][j]=='\n')
break;
}
if(s[i][j] != EOF && i== n)
i=0;
}
for (i = 0; i <= n; i++)
for (j = 0; s[i][j] != '\n'; j++)
printf("%c", s[i][j]);
}
else
printf("renter the value of n");
}
fclose(in);
fclose(out);
return 0;
}

Here are some problems you can look into:
char s[10][100]; See if the sizes match the way you're using s : s[i][j]=c;
for(i=0;i<=n;i++) Are you printing n or n+1 lines ?
for (; j != EOF; ++j) j is only an array index; inside the loop you never set it from the file, so comparing it with EOF cannot end the loop
Plus: the following assignment probably does more harm than good:
if(s[i][j] != EOF && i== n)
i=0;
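One way to act on these points is sketched below. It swaps the character loop for fgets, so it is not a direct repair of the posted code. The idea is to keep only the last n lines in a fixed 2D array by writing line number k into row k % n, then print starting from the oldest stored row. The names tail_lines, MAXN and MAXLEN are made up for this illustration, and a line longer than MAXLEN-1 characters is counted as more than one line.

```c
#include <stdio.h>

enum { MAXN = 10, MAXLEN = 100 };      /* same sizes as s[10][100] */

/* tail_lines() is a hypothetical helper: it copies the last n lines of
 * in to out and returns the number of lines written, or -1 when n is
 * outside 1..MAXN.
 */
int tail_lines (FILE *in, FILE *out, int n)
{
    static char s[MAXN][MAXLEN];
    long count = 0;                    /* total lines read so far */
    long first, k;

    if (n < 1 || n > MAXN)
        return -1;

    while (fgets (s[count % n], MAXLEN, in))   /* overwrite oldest row */
        count++;

    first = count > n ? count - n : 0;         /* oldest line still stored */
    for (k = first; k < count; k++)
        fputs (s[k % n], out);                 /* '\n' is still in the row */

    return (int)(count - first);
}
```

Because the row index wraps with the modulo, there is no need to reset i by hand, and the check on n keeps every access inside the array.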

While you are free to approach the reading of lines one-character-at-a-time using character-oriented input functions, you are making your job a bit more difficult than it needs to be. C offers several functions that provide line-oriented input (e.g. fgets and getline), that are much better suited for reading text one line-at-a-time.
For your purposes here, it seems you need to read/store all lines of input, then based upon a user given number (say 'n'), write/display the last 'n' lines of input to your given output file. While dynamically allocating storage to accommodate an unlimited number of lines or characters does not take that much more effort, below we will use a statically declared array with a MAXL maximum number of lines, each containing a MAXC maximum number of characters. MAXL and MAXC will be constants for the code specified using an anonymous enum. The alternative to using an enum is to #define both as constants.
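For comparison, the dynamic alternative mentioned above could look roughly like the following sketch. The helper name read_lines and the starting size of 8 pointers are assumptions made for the illustration. Each line is still read through a fixed MAXC buffer, so only the number of lines is unlimited here, not the line length.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { MAXC = 128 };                   /* max chars per read, as in the text */

/* read_lines() is a hypothetical helper. It reads every line of fp into
 * newly allocated storage, cuts each line at its first '\r' or '\n',
 * stores the line count in *n and returns the array (NULL if nothing
 * could be stored). The caller frees each line, then the array itself.
 */
char **read_lines (FILE *fp, size_t *n)
{
    char buf[MAXC];
    char **lines = NULL;
    size_t used = 0, avail = 0;

    while (fgets (buf, MAXC, fp)) {
        size_t len = strcspn (buf, "\r\n");    /* length up to line ending */
        buf[len] = 0;

        if (used == avail) {                   /* pointer array full: grow */
            size_t newsz = avail ? avail * 2 : 8;
            char **tmp = realloc (lines, newsz * sizeof *tmp);
            if (!tmp)                          /* keep what was read so far */
                break;
            lines = tmp;
            avail = newsz;
        }
        lines[used] = malloc (len + 1);
        if (!lines[used])
            break;
        memcpy (lines[used], buf, len + 1);    /* copy text and nul byte */
        used++;
    }
    *n = used;
    return lines;
}
```

Assigning the result of realloc to a temporary pointer first means the lines already stored are not lost if the reallocation fails.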
Rather than hardcoding filenames and the number of lines to display, you can easily pass that information as arguments to your program, making use of getopt to pick off the number of end (or tail) lines to display. Any remaining arguments will be considered input/output filenames (using stdin and stdout as defaults if no additional filenames are provided as arguments).
Below, two helper functions xfopen and processopts are used simply to move the error checking for fopen and option processing with getopt into functions to keep the main body of the code readable.
Using line-oriented input greatly simplifies the reading and storing of lines in your array. The only twist with line-oriented input functions is that they read-and-include the trailing '\n' (newline) character as part of their input. Meaning your only additional task is to remove the trailing '\n' by simply overwriting it with a nul-terminating character '\0' (or simply 0). So when reading lines with fgets, you will generally see something similar to:
while (fgets (array[idx], MAXC, ifp)) /* read each line of input */
{
size_t len = strlen (array[idx]); /* get length - used below */
while (len && (array[idx][len-1] == '\r' || array[idx][len-1] == '\n'))
array[idx][--len] = 0; /* strip trailing '\r' & '\n' */
if (++idx == MAXL) { /* test if line limit reached */
fprintf (stderr, "warning: MAXL lines read.\n");
break;
}
}
Which simply reads a line of information into the buffer array[idx], including a maximum of MAXC characters (including the nul-terminating character) from the file stream ifp. The length of the string read is found with strlen, and then, working backwards using the length as an index, each '\n' (or '\r\n' on Windows) is overwritten with a nul-terminating character. Lastly, the line index idx is incremented and tested against the constant MAXL to ensure you don't write beyond the end of the array. (If array was dynamically allocated, you would realloc here.)
That is literally all that is required to do line-oriented input. You can efficiently read as many or as few lines as needed simply by calling fgets once per line and then removing the line endings. (There are some additional checks you can do to ensure a complete line was read, etc., which are not relevant to this discussion; here any part of a line that exceeds MAXC - 1 chars will simply be read as the next line.)
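If you did want the complete-line check, one possible shape is sketched below. The helper name read_full_line and its return convention are assumptions for the illustration only. It treats a missing '\n' as "line too long" unless the next character in the stream is a newline or the end of file has been reached.

```c
#include <stdio.h>
#include <string.h>

/* read_full_line() is a hypothetical helper. It reads one line into buf
 * and strips the line ending. Returns 1 if the whole line fit, 0 if the
 * line was longer than the buffer (the excess is read and discarded),
 * and -1 at end of file or on a read error.
 */
int read_full_line (char *buf, int size, FILE *fp)
{
    size_t len;
    int c, complete = 1;

    if (!fgets (buf, size, fp))
        return -1;

    len = strlen (buf);
    if (len && buf[len-1] == '\n')             /* newline was read: done */
        buf[--len] = 0;
    else {
        c = fgetc (fp);                        /* look at the next char */
        if (c != EOF && c != '\n') {           /* more of this line remains */
            while (c != '\n' && c != EOF)      /* discard rest of the line */
                c = fgetc (fp);
            complete = 0;
        }
    }
    if (len && buf[len-1] == '\r')             /* handle "\r\n" endings */
        buf[--len] = 0;

    return complete;
}
```

With this in place, a long line is truncated instead of spilling over into the next array row, and the return value tells the caller that truncation happened.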
Putting all the pieces together you could do something like the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
enum { MAXL = 64, MAXC = 128 }; /* constants for max lines and max chars */
FILE *xfopen (const char *fn, const char *mode); /* helper for fopen */
void processopts (int argc, char **argv, size_t *nlines, FILE **ifp, FILE **ofp);
int main (int argc, char **argv) {
char array[MAXL][MAXC] = {{0}}; /* array to hold MAXL strings */
size_t i, idx = 0, nlines = 0; /* index, num lines to print */
FILE *ifp, *ofp;
processopts (argc, argv, &nlines, &ifp, &ofp); /* handle option */
while (fgets (array[idx], MAXC, ifp)) /* read each line of input */
{
size_t len = strlen (array[idx]); /* get length - used below */
while (len && (array[idx][len-1] == '\r' || array[idx][len-1] == '\n'))
array[idx][--len] = 0; /* strip trailing '\r' & '\n' */
if (++idx == MAXL) { /* test if line limit reached */
fprintf (stderr, "warning: MAXL lines read.\n");
break;
}
}
if (ifp != stdin) fclose (ifp); /* close inputfile */
if (nlines >= idx) nlines = 0; /* validate nlines */
if (nlines) { /* tail only nlines lines */
for (i = idx - nlines; i < idx; i++)
fprintf (ofp, "%s\n", array[i]);
}
else { /* print all lines */
for (i = 0; i < idx; i++)
fprintf (ofp, "%s\n", array[i]);
}
if (ofp != stdout) fclose (ofp); /* close output fp */
return 0;
}
/* fopen with error checking */
FILE *xfopen (const char *fn, const char *mode)
{
FILE *fp = fopen (fn, mode);
if (!fp) {
fprintf (stderr, "xfopen() error: file open failed '%s'.\n", fn);
// return NULL;
exit (EXIT_FAILURE);
}
return fp;
}
void processopts (int argc, char **argv, size_t *nlines, FILE **ifp, FILE **ofp)
{
int opt; /* used with getopt to parse */
/* process '-n X' option to set number of tail lines */
while ((opt = getopt (argc, argv, "n:")) != -1) {
switch (opt) {
case 'n':
*nlines = atoi (optarg);
break;
}
}
/* infile/outfile are remaining args (default: stdin/stdout) */
*ifp = argc > optind ? xfopen (argv[optind], "r") : stdin;
*ofp = argc > optind + 1 ? xfopen (argv[optind + 1], "w") : stdout;
}
Input File
$ cat dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Reading/Writing all to stdin/stdout
$ ./bin/fgets_array_static_opt <dat/captnjack.txt
This is a tale
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Tailing last 2 lines to stdout
$ ./bin/fgets_array_static_opt <dat/captnjack.txt -n2
A Pirate So Brave
On the Seven Seas.
Tailing last 3 lines to foo.txt
$ ./bin/fgets_array_static_opt dat/captnjack.txt -n 3 debug/foo.txt
$ cat debug/foo.txt
Of Captain Jack Sparrow
A Pirate So Brave
On the Seven Seas.
Let me know if you have any questions.
