IO results. Tweak the code [duplicate] - c

This question already has answers here:
reading a text file into an array
(2 answers)
Closed 8 years ago.
I need some help with an assignment. This is what I have so far:
#include <stdio.h>
#include <stdlib.h>
#define MAXN 100
int main(){
int ch = 0;
FILE *fi = NULL;
FILE *fo = NULL;
int numo = 0;
int numi = 0;
int nump = 0;
fo = fopen("OutputFile.txt", "w+");
if (!fo) {
perror ("Error while opening the file.\n");
exit (EXIT_FAILURE);
}
fi = fopen("InputFile.txt","r");
if(!fi){
perror("Error while opening the file.\n");
exit(EXIT_FAILURE);
}
printf("\n The contents of %s file are :\n", "InputFile.txt");
while( ( ch = fgetc(fi) ) != EOF )
printf("%c",ch);
numi = ch;
numo = numi + 8;
fprintf (fo, " %d\n", numo);
if (fo) fclose (fo);
return 0;
}
Edit 3. Scrapped the array idea since it's causing me more trouble then success. I reduced the list to just one line in the inputfile.txt. The math is simple so I can see what I'm doing and where its going wrong. I've got everything working for the most part, it's just a bit glitched.
First off, The program will read the file just fine and display it in the program. The problem comes after that point and the point the results are saved into OutputFile.txt. Depending on which % I use (%s, %i, %d, %c) The result in OutputFile.txt is either -1, a character, or a longer list of numbers.
How do I get the number from InputFile.txt and save the number to numi?
This is what it looks like
The contents of InputFile.txt are: 10010110
numo=ch+8
Result (Numo) save to OutputFile.txt
When I open the OutputFile.txt the line read 7. So for some reason CH = -1 (Which I want it to equal 10010110) And I'm not sure where the -1 is coming from.

There are many ways to put the pieces of the puzzle together. Finding the tool for the job is 1/2 the battle. In this case strtol will covert base 2 to decimal. The key is to recognize, there is no reason to do character input and you can simplify your code using line-oriented input which will provide your data in a format ready for conversion.
Below are the pieces of the puzzle. They are left somewhat out of order so that you can work to rearrange them to produce a final output file with both the string and the decimal value contained within it. You will probably want to open your output file before you read the text file so that both file steams are available during your read loop.
Take a look and let me know if you have any questions. Note: this is just one of many, many ways to approach this problem:
#include <stdio.h>
#include <stdlib.h>
#define MAXN 100
int main () {
char file_input[25] = { 0 }; /* always initialize all variables */
char file_output[25] = { 0 };
FILE *fi = NULL;
FILE *fo = NULL;
int integers[MAXN] = { 0 };
int i = 0;
int num = 0;
printf ("\n Please enter the input filename: ");
while (scanf ("%[^\n]%*c", file_input) != 1)
fprintf (stderr, "error: read failed for 'file_input', try again\n filename: ");
fi = fopen (file_input, "r"); /* open input file and validate */
if (!fi) {
perror ("Error while opening the file.\n");
exit (EXIT_FAILURE);
}
printf ("\n The contents of file '%s' are :\n\n", file_input);
char *line = NULL; /* NULL forces getline to allocate */
size_t n = 0; /* max chars to read (0 - no limit */
ssize_t nchr = 0; /* number of chars actually read */
while ((nchr = getline (&line, &n, fi)) != -1) {
if (line[nchr - 1] == '\n')
line[--nchr] = 0; /* strip newline from end of line */
integers[i] = strtol (line, NULL, 2); /* convert to decimal */
printf (" %s -> %d\n", line, integers[i]);
if (i == MAXN - 1) { /* check MAXN limit not exceeded */
fprintf (stderr, "error: input lines exceed %d\n", MAXN);
exit (EXIT_FAILURE);
}
i++;
}
if (line) free(line); /* free memory allocated by getline */
if (fi) fclose (fi); /* close file stream when done */
num = i; /* save number of elements in array */
printf ("\n Conversion complete, output filename: ");
while (scanf ("%[^\n]%*c", file_output) != 1)
fprintf (stderr, "error: read failed for 'file_output', try again\n filename: ");
fo = fopen (file_output, "w+"); /* open output file & validate */
if (!fo) {
perror ("Error while opening the file.\n");
exit (EXIT_FAILURE);
}
for (i = 0; i < num; i++) /* write integers to output file */
fprintf (fo, " %d\n", integers[i]);
if (fo) fclose (fo);
return 0;
}
Use/output:
$ ./bin/arrayhelp
Please enter the input filename: dat/binin.txt
The contents of file 'dat/binin.txt' are :
01000101 -> 69
11010110 -> 214
11101110 -> 238
Conversion complete, output filename: dat/binout.txt
$ cat dat/binout.txt
69
214
238
Reading Character-by-character
While this isn't the easiest way to handle reading the file, there is nothing wrong with it. However, you have logic problems. Specifically, you read (and assign as integer) numi = ch; and then assign numo = numi + 8; to write to your output file. This results in adding 8 to the ASCII value of '0' (48) or '1' (49). If you add 8 to that, well, you can do the math. When you read from a file as text, you are reading the ASCII value, NOT the numeric value 1 or 0.
In order to accomplish what you appear to be attempting, you must save all characters in a line to a buffer (a string, a character array, I don't care what you call it). That is the only way, (short of doing a character-by-character conversion to a numeric 1 or 0 and then preforming the binary addition), you have to convert the string of '0's and '1's to a decimal value.
Here is an example using the character-by-character read from fi. Read though it and understand why it needs to be done this way. If you have questions, drop another comment.
#include <stdio.h>
#include <stdlib.h>
#define MAXN 100
int main () {
int ch = 0;
FILE *fi = NULL;
FILE *fo = NULL;
// int numo = 0;
// int numi = 0;
// int nump = 0;
char buffer[MAXN] = { 0 }; /* buffer to hold each line */
int idx = 0; /* index for buffer */
fo = fopen ("OutputFile.txt", "w+"); /* open output file & validate */
if (!fo) {
perror ("Error while opening the file.\n");
exit (EXIT_FAILURE);
}
fi = fopen ("InputFile.txt", "r"); /* open input file & validate */
if (!fi) {
perror ("Error while opening the file.\n");
exit (EXIT_FAILURE);
}
printf ("\n The contents of %s file are :\n\n", "InputFile.txt");
fprintf (fo, " binary decimal\n"); /* header for output file */
while (1) /* loop and test for both '\n' and EOF (-1) to parse file */
{
// printf ("%c", ch); /* we will store each ch in line in buffer */
if ((ch = fgetc (fi)) == '\n' || ch == EOF)
{
if (ch == EOF && idx == 0) /* if EOF (-1) & buffer empty exit loop */
break;
buffer[idx] = 0; /* null-terminate buffer (same as '\0' ) */
idx = 0; /* reset index for next line & continue */
/* write original value & conversion to fo */
fprintf (fo, " %s => %ld\n", buffer, strtol (buffer, NULL, 2));
/* write fi contents to stdout (indented) */
printf (" %s\n", buffer);
}
else
{
buffer[idx++] = ch; /* assign ch to buffer, then increment idx */
}
/* This makes no sense. You are reading a character '0' or '1' from fi,
the unsigned integer value is either the ASCII value '0', which is
decimal 48 (or hex 0x30), or the ASCII value '1', decimal 49/0x31.
If you add 8 and write to 'fo' with '%d' you will get a 16-digit
string of a combination of '56' & '57', e.g. 56575756....
numi = ch;
numo = numi + 8;
*/
}
if (fi) /* close both input and output file streams */
fclose (fi);
if (fo)
fclose (fo);
return 0;
}
output to stdout:
$ ./bin/arrayhelp2
The contents of InputFile.txt file are :
01000101
11010110
11101110
OutputFile.txt:
$ cat OutputFile.txt
binary decimal
01000101 => 69
11010110 => 214
11101110 => 238

Related

Reading a text file to extract coordinates in c

i am trying to read a text file and extract the cordinates of 'X' so that i can place then on a map
the text file is
10 20
9 8 X
2 3 P
4 5 G
5 6 X
7 8 X
12 13 X
14 15 X
I tried multiple times but I am unable to extract the relevant data and place it in separate variables to plot
I am quite new to c and am trying to learn things so any help is appreciated
thanks in advance
From my top comments, I suggested an array of point structs.
Here is your code refactored to do that.
I changed the scanf to use %s instead of %c for the point name. It generalizes the point name and [probably] works better with the input line because [I think] the %c would not match up correctly.
It compiles but is untested:
#include <stdio.h>
#include <stdlib.h>
struct point {
int x;
int y;
char name[8];
};
struct point *points;
int count;
int map_row;
int map_col;
void
read_data(const char *file_name)
{
FILE *fp = fopen(file_name, "r");
if (fp == NULL) {
/* if the file opened is empty or has any issues, then show the error */
perror("File Error");
return;
}
/* get the dimensions from the file */
fscanf(fp, "%d %d", &map_row, &map_col);
map_row = map_row + 2;
map_col = map_col + 2;
while (1) {
// enlarge dynamic array
++count;
points = realloc(points,sizeof(*points) * count);
// point to place to store data
struct point *cur = &points[count - 1];
if (fscanf(fp, "%d %d %s", &cur->x, &cur->y, cur->name) != 3)
break;
}
// trim to amount used
--count;
points = realloc(points,sizeof(*points) * count);
fclose(fp);
}
There are a number of way to approach this. Craig has some very good points on the convenience of using a struct to coordinate data of different types. This approach reads with fgets() and parses the data you need with sscanf(). The benefit eliminates the risk of a matching-failure leaving characters unread in your input stream that will corrupt the remainder of your read from the point of matching-failure forward. Reading with fgets() you consume a line of input at a time, and that read is independent of the parsing of values with sscanf().
Putting it altogether and allowing the filename to be provided by the first argument to the program (or reading from stdin by default if no argument is provided), you can do:
#include <stdio.h>
#define MAXC 1024 /* if you need a constand, #define one (or more) */
int main (int argc, char **argv) {
char buf[MAXC]; /* buffer to hold each line */
int map_row, map_col; /* map row/col variables */
/* use filename provided as 1st argument (stdin if none provided) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open");
return 1;
}
/* read/validate first line saving into map_row, map_col */
if (!fgets (buf, MAXC, fp) ||
sscanf (buf, "%d %d", &map_row, &map_col) != 2) {
fputs ("error: EOF or invalid map row/col data.\n", stderr);
return 1;
}
/* loop reading remaining lines, for used as line counter */
for (size_t i = 2; fgets (buf, MAXC, fp); i++) {
char suffix;
int x, y;
/* validate parsing x, y, suffix from buf */
if (sscanf (buf, "%d %d %c", &x, &y, &suffix) != 3) {
fprintf (stderr, "error: invalid format line %zu.\n", i);
continue;
}
if (suffix == 'X') { /* check if line suffix is 'X' */
printf ("%2d %2d %c\n", x, y, suffix);
}
}
if (fp != stdin) { /* close file if not stdin */
fclose (fp);
}
}
(note: this just illustrates the read and isolation of the values from lines with a suffix of 'X'. Data handling, and calculations are left to you)
Example Use/Output
With your data in dat/coordinates.txt you could do:
$ ./bin/readcoordinates dat/coordinates.txt
9 8 X
5 6 X
7 8 X
12 13 X
14 15 X
As Craig indicates, if you need to store your matching data, then an array of struct provides a great solution.

Read a line from file in C and extract the number of input

I have a file input.dat. In this file, there are 3 lines:
1 2 3
5 7 10 12
8 9 14 13 15 17
I am going to read one of the three lines using C, and return the number of the elements.
For example, I want to read the 2nd line 5 7 10 12 into memory, and also return the number of values in the 2nd line, which is 4. My code is below...
#include <stdio.h>
#include <stdlib.h>
#define STRING_SIZE 2000
int main() {
FILE *fp = fopen("in.dat", "r");
char line[STRING_SIZE];
int lcount = 0, nline = 1, sum = 0, number;
if (fp != NULL) {
while (fgets(line, STRING_SIZE, fp) != NULL) {
if (lcount == nline) {
while (sscanf(line, "%d ", &number)) {
sum++;
}
break;
} else {
lcount++;
}
}
fclose(fp);
}
exit(0);
}
When I run this code, it never stops like a dead loop. What is the problem here?
the loop while (sscanf(line, "%d ", &number)) keeps parsing the first number in the line.
You should use strtol instead:
#include <stdio.h>
#include <stdlib.h>
#define STRING_SIZE 2000
int main() {
FILE *fp = fopen("in.dat", "r");
char line[STRING_SIZE];
int lcount = 0, nline = 1;
if (fp != NULL) {
while (fgets(line, STRING_SIZE, fp) != NULL) {
if (lcount == nline) {
char *p = line, *q;
int count = 0;
for (;;) {
long val = strtol(p, &q, 0); // parse an integer
if (q == p) {
// end of string or not a number
break;
}
// value was read into val. You can use it for whatever purpose
count++;
p = q;
}
printf("%d\n", count);
break;
} else {
lcount++;
}
}
fclose(fp);
}
return 0;
}
You were thinking along the right path with your use of sscanf(), the only piece of the puzzle you were missing is how to apply an offset to line so that you read the next value in the line with your next call to sscanf(). You do that by keeping track of the number of characters consumed on each call to sscanf() using the "%n" conversion (it does not add to your conversion count returned by sscanf()) For example reading lines from the open file-stream fp, you could do:
#define MAXC 1024 /* if you need a constant, #define one (or more) */
...
char line[MAXC] = ""; /* buffer to hold each line */
...
while (fgets (line, MAXC, fp)) { /* reach each line in file */
int offset = 0, /* offset in line for next sscanf() read */
nchr = 0, /* number of char consumed by last read */
val, /* integer value read with sscanf() */
nval = 0; /* number of values read in line */
/* conververt each integer at line + offset, saving no. of chars consumed */
while (sscanf (line + offset, "%d%n", &val, &nchr) == 1) {
printf (" %d", val); /* output value read */
offset += nchr; /* update offset with no. chars consumend */
nval++; /* increment value count */
}
printf (" - %d values\n", nval); /* output no. values in line */
}
(Note: strtol() provides better error reporting than sscanf() on a failed conversion)
If you put it together with an example that reads from the filename provided as the first argument to the program (or reads from stdin by default if no argument is given), you could do:
#include <stdio.h>
#define MAXC 1024 /* if you need a constant, #define one (or more) */
int main (int argc, char **argv) {
char line[MAXC] = ""; /* buffer to hold each line */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fgets (line, MAXC, fp)) { /* reach each line in file */
int offset = 0, /* offset in line for next sscanf() read */
nchr = 0, /* number of char consumed by last read */
val, /* integer value read with sscanf() */
nval = 0; /* number of values read in line */
/* conververt each integer at line + offset, saving no. of chars consumed */
while (sscanf (line + offset, "%d%n", &val, &nchr) == 1) {
printf (" %d", val); /* output value read */
offset += nchr; /* update offset with no. chars consumend */
nval++; /* increment value count */
}
printf (" - %d values\n", nval); /* output no. values in line */
}
if (fp != stdin) /* close file if not stdin */
fclose (fp);
}
Example Use/Output
With the data you show in the filename dat/nvals.txt you would get:
$ ./bin/fgetsnvals dat/nvals.txt
1 2 3 - 3 values
5 7 10 12 - 4 values
8 9 14 13 15 17 - 6 values
Look things over and let me know if you have further questions.
A bit cleaner version of chqrlie's answer. Started with a string as that is what the question is really about after fgets().
sscanf() won't step through the string, it always reads from the beginning.
strtol() looks for a long int in the beginning of the string, ignoring initial white-space. Gives back the address of where it stops scanning.
The manual of strtol() says that errno should be checked for any conversion error.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#define STRING_SIZE 2000
int main(void)
{
char line[STRING_SIZE] = "5 7 10 12";
char* start = line;
char* end;
int count = 0;
while(1)
{
/**
* strtol() look for long int in beginning of the string
* Ignores beginning whitespace
*
* start: where to strtol() start looking for long int
* end: where strtol() stops scanning for long int
*/
errno = 0; // As strol() manual says
strtol(start, &end, 0);
if (errno != 0)
{
printf("Error in strtol() conversion.\n");
exit(0);
}
if (start == end) break; // Quit loop
start = end;
count++;
}
printf("count: %d\n", count);
return 0;
}

Read Text File with blank lines

The Text File was writable with 1 blank line after the struct.
Is similar to:
1 111 1 Peter
22 22 2 John Lays
3 3 3 Anne Belgs
The struct is:
struct estruturaCarro {
int id, potencia, avariado;
char name[11];
} carro;
I have write the data to the text File:
fprintf(fp, "%-2d %-3d %-1d %-10s \n\n", carro.id, carro.potencia, carro.avariado, carro.name);
I read the file data this way, but I'm sure it's not the best way:
while(true){
int result = fscanf(fp, "%d %d %d %10[^\n]", &carro.id, &carro.potencia, &carro.avariado, &carro.name);
if(result==4){
printf("%-2d %-3d %-1d % -s\n", carro.id, carro.potencia, carro.avariado, carro.name);
}else break;
}
How can I read the text file, saving the data struct, without reading the blank lines?
If I want to validate the values ​​read (ID = xx, potencia = xxx, avariado = x, name = 10 characters), is it better, before or after filling the array of struct?The format of each line of the file is:xx xxx x aaaaaaaaaa (x = digits, a = characters)For example, if one of the lines is
9 123 4 Error 1st value
stop reading the file (informing the user), because the line should be:
9 123 4 Error 1st value (one space was missing after 9) - or because NAME has more than 10 characters.
You don't show how true is set in while(true), but it must be based on the return of fscanf in the way you are approaching it. You must test three cases:
fscanf return is EOF, read done, true should be set to FALSE to break loop;
fscanf return less than 4, matching or input failure, no guarantee this occurred due to an empty line; and
fscanf return is equal to 4, good input.
Unless you are checking all three, you cannot cover all cases for fscanf and further, if there is any additional or extraneous character in your input file, your formatted read will fail.
That is why a better option is to read each line into a buffer with fgets and then parse the information you need from the buffer itself with sscanf. This provides the benefit of allowing separate validation of (1) the read; and (2) the parse of information. Further, since you are consuming a line-at-a-time, there is no uncertainty on what remains in the input stream and a failure in parsing any one line does not prevent a successful read of the remainder.
A short implementation with fgets() and sscanf() reading each struct into an array-of-struct, you can do something similar to the following:
#include <stdio.h>
#define MAXN 11 /* if you need a constant, #define one (or more) */
#define MAXC 1024 /* (don't skimp on buffer size) */
typedef struct { /* use a simple typedef */
int id, potencia, avariado;
char name[MAXN];
} carro_t;
int main (int argc, char **argv) {
char buf[MAXC]; /* buffer to read line */
carro_t carro[MAXN] = {{ .id = 0 }}; /* array of struct */
size_t ndx = 0; /* array index */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (ndx < MAXN && fgets (buf, MAXC, fp)) { /* read each line */
carro_t tmp = { .id = 0 }; /* temp struct */
if (*buf == '\n') /* if line empty */
continue; /* get next line */
if (sscanf (buf, "%d %d %d %10[^\n]", /* separate/validate */
&tmp.id, &tmp.potencia, &tmp.avariado, tmp.name) == 4)
carro[ndx++] = tmp; /* add to array of struct */
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
for (size_t i = 0; i < ndx; i++)
printf ("%3d %3d %3d %s\n",
carro[i].id, carro[i].potencia, carro[i].avariado, carro[i].name);
return 0;
}
(note: the filename is provide as the first argument to the program, or if no filename is provided, read from stdin by default)
While with your particular data file, there is no reason you cannot use fscanf, it is fragile and one character too many (like "Anne Belgss") will cause it to break. A fscanf implementation removing the true and simply looping as you have could be:
for (;;) {
carro_t tmp = { .id = 0 };
if (fscanf (fp, "%d %d %d %10[^\n]", /* separate/validate */
&tmp.id, &tmp.potencia, &tmp.avariado, tmp.name) == 4)
carro[ndx++] = tmp; /* add to array of struct */
else
break;
}
Either way, you will produce the same output with "this" input file, e.g.
Example Use/Output
$ ./bin/readwblanks ~/tmpd/file
1 111 1 Peter
22 22 2 John Lays
3 3 3 Anne Belgs
just use simple reading way is ok,like this.
while(fscanf(fp,"%d %d %d %s",&carro.id,&carro.potencia,&carro.avariado,carro.name)!=EOF)
printf("%d %d %d %s\n",carro.id,carro.potencia,carro.avariado,carro.name);
result is like this
enter image description here

Check if all letters of the alphabet appear in a file with no repeats

I'm writing a program that needs to read a CSV file and check that all letters in the alphabet appear one time on each side of the comma. The file would look something like this:
a,x
b,j
c,g
d,l
e,s
f,r
g,u
h,z
i,w
j,c
k,e
l,a
m,v
but there would be 26 lines total. What would be the most efficient way to check that each side has all 26 letters with no repeats?
While it is unclear from your question and follow-up comments where exactly you are stuck, or whether you have thrown in the towel and given up, let's take it from the beginning.
Open Your File (or reading stdin)
Before you can do anything with the content of your file, you need to open you file for reading. For reading formatted-input you will generally use the functions that read and write from a file stream using a FILE * stream pointer (as opposed to the low-level file-descriptor file interface). To open your file, you will call fopen and check the return to validate the open succeeded.
Do not hard-code filenames or numbers in your program. Your program takes arguments, either pass the filename to open as an argument, or prompt for entry of the filename. You can increase the flexibility of your program by taking the filename to read as an argument, or read from stdin by default if no argument is provided (as most Linux utilities do). Since stdin is a file stream, you can simply assign it to your FILE* pointer if you are not opening a filename provided as an argument. For example:
FILE *fp = NULL;
if (argc > 1) /* if one argument provided */
fopen (argv[1], "r"); /* open file with name from argument */
else
fp = stdin; /* set fp to stdin */
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
which can be shortened using the ternary operator, e.g.:
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
Reading Your Data
With a file-stream open and validated, you can now read your data from your file. While you could read with fscanf, you are limited in the information it provides in the event less that two values are read. Additionally, reading with the scanf family of functions is full of pitfalls due to what characters remain in your input file-stream depending on the conversion specifiers used and on whether the conversion succeeded or failed. Nonetheless, a simple approach that validates two conversions took place according to your format-string will allow you to read your file, e.g.
char c1, c2; /* characters from each line */
int freq1[MAXC] = {0}, freq2[MAXC] = {0}; /* frequency arrays */
...
while (fscanf (fp, " %c, %c", &c1, &c2) == 2) /* read all chars */
if (c1 > 0 || c2 > 0) /* validate ASCII values */
/* increment element in each */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
(the downside is any variation in format on one line can leave you with unwanted characters stored, and if less than two conversions take place, your read loop stops despite valid data remaining unread)
A better approach is to read a line-at-a-time with a line-oriented input function such as fgets or POSIX getline. With this approach, you are consuming a line of data at a time, and then parsing the needed information from the stored line. The benefits are significant. You have an independent validation of the read itself, and then whether you find the needed values in the line. If your format varies an you parse less than the needed values from the line, you have the option of simply skipping that line and continuing with the next. Further, what remains in your input file stream does not depend on the conversion specifiers used.
An example with fgets and sscanf doing the same thing would be:
char c1, c2, /* characters from each line */
buf[MAXC] = ""; /* buffer to hold each line */
...
while (fgets (buf, MAXC, fp)) /* read all chars */
if (sscanf (buf, " %c, %c", &c1, &c2) == 2) { /* parse values */
if (c1 > 0 || c2 > 0) /* validate ASCII values */
/* increment element in each */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
}
else
fputs ("error: in line format.\n", stderr);
Handling the Frequency of Characters
If you have been paying attention to the read of the data from the file, you will note that a pair of frequency arrays have been incremented on each read of the characters freq1 and freq2. As mentioned in my comments above, you start with an adequately sized array of int to hold the ASCII character set. The arrays are initialized to zero. When you read a character from each column, you simply increment the value at:
if (c1 > 0 || c2 > 0) /* validate ASCII values */
/* increment each element */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
For example the ASCII value for 'a' is 97 (see ASCII Table and Description). So if you read an 'a' and increment
freq1['a']++;
that is the same as incrementing:
freq1[97]++;
When you are done with your read loop, you simply need to iterate over your frequency arrays from 'a' to 'z' and the number of times the corresponding character appeared in your file will be captured in your array. Then you can use the data however you like.
Outputting The Results
The simplest way to output your column1/column2 results is simply to output the number of occurrences for each character. For example:
for (int i = 'a'; i <= 'z'; i++) /* loop over 'a' to 'z' */
printf (" %c: %d, %d\n", i, freq1[i], freq2[i]);
Which will produce output similar to:
$ ./bin/freq_dual_col2 <dat/char2col.txt
lowercase occurrence:
a: 1, 1
b: 1, 0
c: 1, 1
d: 1, 0
e: 1, 1
f: 1, 0
...
If you wanted to get a little more verbose and note whether the characters appears "none", or 1 or whether the character was duplicated "dupe", you could employ a few additional checks, e.g.
for (int i = 'a'; i <= 'z'; i++) { /* loop over 'a' to 'z' */
if (freq1[i] == 1) /* check col 1 chars */
printf (" %c , ", i);
else if (!freq1[i])
fputs ("none, ", stdout);
else
fputs ("dupe, ", stdout);
if (freq2[i] == 1) /* check col 2 chars */
printf (" %c\n", i);
else if (!freq2[i])
fputs ("none\n", stdout);
else
fputs ("dupe\n", stdout);
}
Which would produce output as:
$ ./bin/freq_single_dual_col <dat/char2col.txt
lowercase single occurrence, none or dupe:
a , a
b , none
c , c
d , none
e , e
f , none
...
Putting it altogether, your minimal example using fscanf for your read could be similar to:
#include <stdio.h>
#include <limits.h>
#define MAXC UCHAR_MAX+1
int main (int argc, char **argv) {
char c1, c2; /* characters from each line */
int freq1[MAXC] = {0}, freq2[MAXC] = {0}; /* frequency arrays */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fscanf (fp, " %c,%c", &c1, &c2) == 2) /* read all chars */
if (c1 > 0 || c2 > 0) /* validate ASCII values */
/* increment each element */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
if (fp != stdin) fclose (fp); /* close file if not stdin */
puts ("lowercase occurrence:\n");
for (int i = 'a'; i <= 'z'; i++) /* loop over 'a' to 'z' */
printf (" %c: %d, %d\n", i, freq1[i], freq2[i]);
return 0;
}
The example using fgets and sscanf would be similar to:
#include <stdio.h>
#include <limits.h>
#define MAXC UCHAR_MAX+1
int main (int argc, char **argv) {
char c1, c2, /* characters from each line */
buf[MAXC] = ""; /* buffer to hold each line */
int freq1[MAXC] = {0}, freq2[MAXC] = {0}; /* frequency arrays */
/* use filename provided as 1st argument (stdin by default) */
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
perror ("file open failed");
return 1;
}
while (fgets (buf, MAXC, fp)) /* read each line */
if (sscanf (buf, " %c, %c", &c1, &c2) == 2) { /* parse values */
if (c1 > 0 || c2 > 0) /* validate ASCII values */
/* increment each element */
freq1[(unsigned char)c1]++, freq2[(unsigned char)c2]++;
}
else
fputs ("error: in line format.\n", stderr);
if (fp != stdin) fclose (fp); /* close file if not stdin */
puts ("lowercase occurrence:\n");
for (int i = 'a'; i <= 'z'; i++) /* loop over 'a' to 'z' */
printf (" %c: %d, %d\n", i, freq1[i], freq2[i]);
return 0;
}
And if you wanted the more verbose output, then I leave it to you to incorporate it in the code above.
Look things over and let me know if your have further questions.
Add all columns to Sets and check if the Sets are same size of file lines.
remember that Sets ignore duplicates

C program to copy .csv of integers copies one less element unless element size is set to +1

I'm new to learning the C language and I wanted to write a simple program that would copy an array integers from one .csv file to a new .csv file. My code works as intended, however when my array size for fread/fwrite is set to the exact number of elements in the .csv array (10 in this case), it only copies nine of the elements.
When the array size is set to +1, it copies all the elements.
#include <stdio.h>
#include <stdlib.h>
#define LISTSIZE 11
//program that copies an array of integers from one .csv to another .csv
int main(int argc, char * argv[])
{
if (argc != 2)
{
fprintf(stderr, "Usage ./file_sort file.csv\n");
return 1;
}
char * csvfile = argv[1];
FILE * input_csvile = fopen(csvfile, "r"); //open .csv file and create file pointer input_csvile
if(input_csvile == NULL)
{
fprintf(stderr, "Error, Could not open\n");
return 2;
}
unsigned int giving_total[LISTSIZE];
if(input_csvile != NULL) //after file opens, read array from .csv input file
{
fread(giving_total, sizeof(int), LISTSIZE, input_csvile);
}
else
fprintf(stderr, "Error\n");
FILE * printed_file = fopen("school_currentfy1.csv", "w");
if (printed_file != NULL)
{
fwrite(giving_total, sizeof(int), LISTSIZE, printed_file); //copy array of LISTSIZE integers to new file
}
else
fprintf(stderr, "Error\n");
fclose(printed_file);
fclose(input_csvile);
return 0;
}
Does this have something to do with the array being 0-indexed and the .csv file being 1-indexed? I also had an output with the LISTSIZE of 11 which had the last (10) element being displayed incorrectly; 480 instead of 4800.
http://imgur.com/lLOozrc Output/input with LISTSIZE of 10
http://imgur.com/IZPGwsA Input/Output with LISTSIZE of 11
Note: as noted in the comment, fread and fwrite are for reading and writing binary data, not text. If you are dealing with a .csv (comma separated values -- e.g. as exported from MS Excel or Open/LibreOffice calc) You will need to use fgets (or any other character/string oriented function) followed by sscanf (or strtol, strtoul) to read the values as text and perform the conversion to int values. To write the values to your output file, use fprintf. (fscanf is also available for input text processing and conversion, but you lose flexibility in handling variations in input format)
However, if your goal was to read binary data for 10 integers (e.g. 40-bytes of data), then fread and fwrite are fine, but as with all input/output routines, you need to validate the number of bytes read and written to insure you are dealing with valid data within your code. (and that you have a valid output data file when you are done)
There are many ways to read a .csv file, depending on the format. One generic way is to simply read each line of text with fgets and then repeatedly call sscanf to convert each value. (this has a number of advantages in handling different spacing around the ',' compared to fscanf) You simply read each line, assign a pointer to the beginning of the buffer read by fgets, and then call sscanf (with %n to return the number of character processed by each call) and then advance the pointer by that number and scan forward in the buffer until your next '-' (for negative values) or a digit is encountered. (using %n and scanning forward can allow fscanf to be used in a similar manner) For example:
/* read each line until LISTSIZE integers read or EOF */
while (numread < LISTSIZE && fgets (buf, MAXC, fp)) {
int nchars = 0; /* number of characters processed by sscanf */
char *p = buf; /* pointer to line */
/* (you should check a whole line is read here) */
/* while chars remain in buf, less than LISTSIZE ints read
* and a valid conversion to int perfomed by sscanf, update p
* to point to start of next number.
*/
while (*p && numread < LISTSIZE &&
sscanf (p, "%d%n", &giving_total[numread], &nchars) == 1) {
numread++; /* increment the number read */
p += nchars; /* move p nchars forward in buf */
/* find next digit in buf */
while (*p && *p != '-' && (*p < '0' || *p > '9'))
p++;
}
}
Now to create your output file, you simply write numread values back out in comma separated value format. (you can adjust how many your write per line as required)
for (i = 0; i < numread; i++) /* write in csv format */
fprintf (fp, i ? ",%d" : "%d", giving_total[i]);
fputc ('\n', fp); /* tidy up -- make sure file ends with '\n' */
Then it is just a matter of closing your output file and checking for any stream errors (always check on close when writing values to a file)
if (fclose (fp)) /* always validate close after write to */
perror("error"); /* validate no stream errors occurred */
Putting it altogether, you could do something similar to the following:
#include <stdio.h>
#include <stdlib.h>
#define LISTSIZE 10
#define MAXC 256
int main(int argc, char *argv[])
{
if (argc < 3) {
fprintf(stderr, "Usage ./file_sort file.csv [outfile]\n");
return 1;
}
int giving_total[LISTSIZE]; /* change to int to handle negative values */
size_t i, numread = 0; /* generic i and number of integers read */
char *csvfile = argv[1],
buf[MAXC] = ""; /* buffer to hold MAXC chars of text */
FILE *fp = fopen (csvfile, "r");
if (fp == NULL) { /* validate csvfile open for reading */
fprintf(stderr, "Error, Could not open input file.\n");
return 2;
}
/* read each line until LISTSIZE integers read or EOF */
while (numread < LISTSIZE && fgets (buf, MAXC, fp)) {
int nchars = 0; /* number of characters processed by sscanf */
char *p = buf; /* pointer to line */
/* (you should check a whole line is read here) */
/* while chars remain in buf, less than LISTSIZE ints read
* and a valid conversion to int perfomed by sscanf, update p
* to point to start of next number.
*/
while (*p && numread < LISTSIZE &&
sscanf (p, "%d%n", &giving_total[numread], &nchars) == 1) {
numread++; /* increment the number read */
p += nchars; /* move p nchars forward in buf */
/* find next digit in buf */
while (*p && *p != '-' && (*p < '0' || *p > '9'))
p++;
}
}
if (numread < LISTSIZE) /* warn if less than LISTSIZE integers read */
fprintf (stderr, "Warning: only '%zu' integers read from file", numread);
fclose (fp); /* close input file */
fp = fopen (argc > 2 ? argv[2] : "outfile.csv", "w"); /* open output file */
if (fp == NULL) { /* validate output file open for writing */
fprintf(stderr, "Error, Could not open output file.\n");
return 3;
}
for (i = 0; i < numread; i++) /* write in csv format */
fprintf (fp, i ? ",%d" : "%d", giving_total[i]);
fputc ('\n', fp); /* tidy up -- make sure file ends with '\n' */
if (fclose (fp)) /* always validate close after write to */
perror("error"); /* validate no stream errors occurred */
return 0;
}
Like I said, there are many, many ways to approach this. The idea is to build in as much flexibility to your read as possible so it can handle any variations in the input format without choking. Another very robust way to approach the read is using strtol (or strtoul for unsigned values). Both allow will advance a pointer for you to the next character following the integer converted so you can start your scan for the next digit from there.
An example of the read flexibility provide in either of these approaches is shown below. Reading a file of any number of lines, with values separate by any separator and converting each integer encountered to a value in your array, e.g.
Example Input
$ cat ../dat/10int.csv
8572, -2213, 6434, 16330, 3034
12346, 4855, 16985, 11250, 1495
Example Program Use
$ ./bin/fgetscsv ../dat/10int.csv dat/outfile.csv
Example Output File
$ cat dat/outfile.csv
8572,-2213,6434,16330,3034,12346,4855,16985,11250,1495
Look things over and let me know if you have questions. If your intent was to read 40-bytes in binary form, just let me know and I'm happy to help with an example there.
If you want a truly generic read of values in a file, you can tweak the code that finds the number in the input file to scan forward in the file and validate that any '-' is followed by a digit. This allows reading any format and simply picking the integers from the file. For example with the following minor change:
while (*p && numread < LISTSIZE) {
if (sscanf (p, "%d%n", &giving_total[numread], &nchars) == 1)
numread++; /* increment the number read */
p += nchars; /* move p nchars forward in buf */
/* find next number in buf */
for (; *p; p++) {
if (*p >= '0' && *p <= '9') /* positive value */
break;
if (*p == '-' && *(p+1) >= '0' && *(p+1) <= '9') /* negative */
break;
}
}
You can easily process the following file and obtain the same results:
$ cat ../dat/10intmess.txt
8572,;a -2213,;--a 6434,;
a- 16330,;a
- The Quick
Brown%3034 Fox
12346Jumps Over
A
4855,;*;Lazy 16985/,;a
Dog.
11250
1495
Example Program Use
$ ./bin/fgetscsv ../dat/10intmess.txt dat/outfile2.csv
Example Output File
$ cat dat/outfile2.csv
8572,-2213,6434,16330,3034,12346,4855,16985,11250,1495

Resources