I'm writing a program that compares two files character by character. The comparison function returns a value that depends on the state of the files.
The function returns 0 when both files are the same, -1 if the files match but the first file ends before the second, -2 if the files match but the second file ends before the first, and a positive int giving the position of the first character at which the files differ.
#include <stdio.h>
#include <string.h>
#define CMP_EQUAL 0
#define CMP_EOF_FIRST -1
#define CMP_EOF_SECOND -2
int char_cmp(FILE *fp1, FILE *fp2);
int main(void)
{
FILE *fp1;
FILE *fp2;
fp1 = fopen("input1.txt", "rb+");
fp2 = fopen("input2.txt", "rb+");
switch(char_cmp(fp1, fp2))
{
case CMP_EQUAL:
printf("The Files are equal");
break;
case CMP_EOF_FIRST:
printf("EOF on a.txt");
break;
case CMP_EOF_SECOND:
printf("EOF on t.txt");
break;
default:
printf("files differ: char %d\n", char_cmp(fp1, fp2));
break;
}
if(fclose(fp1) != 0)
{
perror("fclose");
/*other error handling*/
}
if(fclose(fp2) != 0)
{
perror("fclose");
/*other error handling*/
}
return 0;
}
int char_cmp(FILE *fp1, FILE *fp2)
{
int c, d;
size_t byte = 0;
int same = 1;
do
{
byte++;
}while((c = fgetc(fp1)) == (d = fgetc(fp2)));
if(c == EOF && d != EOF)
{
return CMP_EOF_FIRST;
}
if(d == EOF && c != EOF)
{
return CMP_EOF_SECOND;
}
if(c != d)
{
return byte;
}
return CMP_EQUAL;
}
I was wondering how I would break out of the do loop after checking that all the characters match in each file. When I have tried, it breaks the moment it finds a character that is the same and does not check the rest.
Also, I've encountered this weird bug where if one file contains:
dee
and the second one contains
ae
it gives me a weird return value, and I was wondering why that is.
Thanks in advance for any help.
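You call char_cmp(fp1, fp2) twice: once in the switch statement and a second time in the default case. The second call continues reading from where the first one stopped, so it returns the position of the next difference, or something else unexpected. With dee and ae, the first call returns 1; the second call reads the matching e characters, hits end-of-file on the second file, and returns -2.
Change it to
int k = char_cmp(fp1, fp2);
and use k in both places: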
switch( k )
...
printf("files differ: char %d\n", k);
EDIT: The infinite loop in the case of equal files happens because in this condition:
(c = fgetc(fp1)) == (d = fgetc(fp2))
c and d both stay equal to EOF once the ends of the files are reached. Change it to
(c = fgetc(fp1)) == (d = fgetc(fp2)) && c != EOF
and the loop terminates.
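For comparison, here is a minimal sketch of the whole function with the reads and the checks written as separate steps. The name compare_streams and the long return type are my own choices, not something from the question; the return codes follow the ones described there.

```c
#include <stdio.h>

#define CMP_EQUAL      0L
#define CMP_EOF_FIRST  (-1L)
#define CMP_EOF_SECOND (-2L)

/* Compare two streams byte by byte.
 * Returns CMP_EQUAL, CMP_EOF_FIRST, CMP_EOF_SECOND, or the 1-based
 * position of the first byte at which the streams differ.
 * (compare_streams is a hypothetical name used for this sketch.) */
long compare_streams(FILE *fp1, FILE *fp2)
{
    long pos = 0;

    for (;;) {
        int c = fgetc(fp1);
        int d = fgetc(fp2);
        pos++;

        if (c == EOF && d == EOF)
            return CMP_EQUAL;       /* both ended together */
        if (c == EOF)
            return CMP_EOF_FIRST;   /* first stream is a prefix of the second */
        if (d == EOF)
            return CMP_EOF_SECOND;  /* second stream is a prefix of the first */
        if (c != d)
            return pos;             /* first mismatch */
    }
}
```

Because the end-of-file cases are tested before the characters are compared, the loop cannot spin on EOF == EOF.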
You are calling char_cmp() multiple times. The second time round, in the printf() call, returns a different value from the first call because the file pointers have been used.
Call char_cmp() once and store the returned value in a local.
int cmp = char_cmp(fp1, fp2);
switch(cmp)
{
case CMP_EQUAL:
printf("The Files are equal");
break;
case CMP_EOF_FIRST:
printf("EOF on a.txt");
break;
case CMP_EOF_SECOND:
printf("EOF on t.txt");
break;
default:
printf("files differ: char %d\n", cmp);
break;
}
I don't know whether the rest of your logic is correct or not.
Actually, your logic is not correct. It enters an infinite loop when presented with identical files. I'm sure you'll be able to track down the problem!
When both reach EOF at the same time, the while condition is true and you start looping over and over, since EOF == EOF.
I suggest writing the loop less tersely at first, with the two reads and the comparison as separate steps.
I am supposed to be "fixing" code given to me to make it display the correct number of visible characters in a file (spaces too). The correct number is supposed to be 977. I have never dealt with files before and I don't understand what I need to do to display the correct number.
/*
* Driver Menu System for Homework
* Andrew Potter - Mar 5, 2019 <-- Please put your name/date here
*/
#include <stdio.h>//header file for input/output -
#include <stdlib.h>
#include <ctype.h>
// since you will place all your assigned functions (programs) in this file, you do not need to include stdio.h again!
int menu(void); //prototype definition section
void hello(void);
void countall(void);
int main(void)
{
int selection = menu();
while(selection != 99) {
switch(selection) {
case 1:
hello();
break;
case 2:
countall();
break;
case 3:
break;
case 4:
break;
default:
printf("Please enter a valid selection.\n");
}
selection = menu();
}
return 0;
}
int menu(void) {
int choice;
printf("***************************\n");
printf(" 1. Hello \n");
printf(" 2. Countall\n");
printf(" 3. \n");
printf(" 4. \n");
printf("99. Exit\n");
printf("Please select number and press enter:\n");
printf("***************************\n");
scanf("%d", &choice);
getchar();
return choice;
}
void hello(void) {
printf("Hello, World!!!\n");
}
//*****Andrew 5/1/19*****
#define SLEN 81 /* from reverse.c */
/* original header: int count(argc, *argv[]) */
void countall(void)
{
int ch; // place to store each character as read
FILE *fp; // "file pointer"
long unsigned count = 0;
char file[SLEN]; /* from reverse.c */
/*Checks whether a file name was included when run from the command prompt
* The argument count includes the program file name. A count of 2 indicates
* that an additional parameter was passed
if (argc != 2)
{
printf("Usage: %s filename\n", argv[0]);
exit(EXIT_FAILURE);
}
* The following uses the second parameter as the file name
* and attempts to open the file
if ((fp = fopen(argv[1], "r")) == NULL)
{
printf("Can't open %s\n", argv[1]);
exit(EXIT_FAILURE);
} */
/*************************************
Code from reverse.c included to make the program work from within our IDE
*************************************/
puts("Enter the name of the file to be processed:");
scanf("%s", file);
if ((fp = fopen(file,"rb")) == NULL) /* read mode */
{
printf("count program can't open %s\n", file);
exit(EXIT_FAILURE);
}
/* EOF reached when C realizes it tried to reach beyond the end of the file! */
/* This is good design - see page 573 */
while ((ch = getc(fp)) != EOF)
{
if (isprint(ch)) {
count++;
}
else if (isprint(ch)) {
count++;
}
putc(ch,stdout); // same as putchar(ch);
count++;
}
fclose(fp);
printf("\nFile %s has %lu characters\n", file, count);
}
I expected I would get the correct number of visible characters using the combination of isprint and isspace but I usually get 2086.
The assignment directions are: "Word identifies 977 characters including spaces. Your current countall() believes there are 1043. Make the corrections necessary to your code to count only the visible characters and spaces! (Hint: check out 567 in your textbook.)" Before I edited any code the count was 1043; now I am getting 2020. I need 977.
isprint() returns a Boolean result: zero if the character is not "printable", and non-zero if it is. As such isprint(ch) != '\n' makes no sense. Your complete expression in the question makes even less sense, but I'll come on to that at the end.
isprint() on its own returns true (non-zero) for all printable characters, so you need no other tests. Moreover you increment count unconditionally and in every conditional block, so you are counting every character and some twice.
You just need:
if( isprint(ch) )
{
count++;
}
putc( ch, stdout ) ;
While your code is clearly an incomplete fragment, it is not clear where or how you are reading ch. You need a getc() or equivalent in there somewhere.
while( (ch = getc(fp)) != EOF )
{
if( isprint(ch) )
{
count++;
}
putc( ch, stdout ) ;
}
It is not clear whether you need to count all whitespace (including space, tab and newline) or just "spaces" as you stated. If so be clear that isprint() will match space, but not control characters newline or tab. isspace() matches all these, but should not be counted separately to isprint() because 'space' is in both white-space and printable sets. If newline and tab are to be counted (and less likely; "vertical tab") then:
while( (ch = getc(fp)) != EOF )
{
if( isprint(ch) || isspace(ch) )
{
count++;
}
putc( ch, stdout ) ;
}
Another aspect of C that you seem to misunderstand is how Boolean expressions work. To test a single variable for multiple values you must write:
if( var == x || var == y || var == z )
You have written:
if( var == x || y || z )
which may make sense in English (or another natural language) when you read it aloud, but in C == binds more tightly than ||, so it means:
if( (var == x) || y || z )
which is true whenever y or z is non-zero, whatever the value of var.
It is probably worth considering the semantics of your existing solution to show why it actually compiles, but produces the erroneous result it does.
Firstly,
isprint(ch) != '\n' || '\t' || '\0'
is parsed as (isprint(ch) != '\n') || '\t' || '\0'. Because '\t' is non-zero, the whole expression is always true, so you increment the counter for every character.
Then here:
isspace(ch) == NULL
NULL is a macro representing a null pointer, and isspace() does not return a pointer. Compilers typically accept the comparison (often with a warning) and treat NULL as zero, so the test is true for characters that are not white-space. If it sits in an else branch of the first test, however, it is never reached, because the first test is always true.
Finally, you unconditionally count every character here:
putc(ch,stdout); // same as putchar(ch);
count++;
So your result will be:
total-number-of-characters (from the first test) +
total-number-of-characters (from the unconditional increment)
which is 2 x file-length, consistent with the 2086 you report for a file that previously counted as 1043.
Finally note that if you open a text file that has CR+LF line ends (conventional for text files on Windows) in "binary" mode, isspace() will count two characters for every new-line. Be sure to open in "text" mode (regardless of the platform).
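To make the CR+LF point concrete, here is a small sketch that applies the isprint() || isspace() test to an in-memory buffer instead of a file. The helper name count_visible_or_space and its buffer interface are made up for illustration; the buffer stands in for what getc() would deliver in binary mode.

```c
#include <ctype.h>
#include <stddef.h>

/* Count the bytes in buf that are printable or white-space.
 * (Hypothetical helper; the buffer plays the role of the bytes
 * that getc() would return from a file opened in binary mode.) */
size_t count_visible_or_space(const char *buf, size_t len)
{
    size_t count = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)buf[i];
        if (isprint(ch) || isspace(ch))
            count++;
    }
    return count;
}
```

With this helper, the four bytes of "a\r\nb" give a count of 4, while the three bytes of "a\nb" give 3: the carriage return is white-space too, so each Windows line end contributes two to the total.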
From isprint():
A printable character is a character that occupies a printing position on a display (this is the opposite of a control character, checked with iscntrl).
and
A value different from zero (i.e., true) if indeed c is a printable character. Zero (i.e., false) otherwise.
So that function should be sufficient. Please note that you have to make sure to pass all these is...() functions from <ctype.h> a value that is representable as unsigned char (or EOF). So if you use them with a value of uncertain origin, such as a plain char, better cast it to unsigned char.
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
int main(void)
{
char const *filename = "test.txt";
FILE *input = fopen(filename, "r");
if (!input) {
fprintf(stderr, "Couldn't open \"%s\" for reading. :(\n\n", filename);
return EXIT_FAILURE;
}
long long unsigned count = 0;
for (int ch; (ch = fgetc(input)) != EOF;) {
if (isprint(ch))
++count;
}
fclose(input);
printf("Count: %llu\n\n", count);
}
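The program above gets its values from fgetc(), which already returns them in the unsigned char range. When the characters come from a char array instead, the cast mentioned earlier is needed. A minimal sketch, with count_printable being a name I made up:

```c
#include <ctype.h>
#include <stddef.h>

/* Count the printable characters in a NUL-terminated string.
 * Plain char may be signed, so each element is converted to
 * unsigned char before it is handed to isprint().
 * (count_printable is a hypothetical helper name.) */
size_t count_printable(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        if (isprint((unsigned char)*s))
            count++;
    }
    return count;
}
```

For the string "Hi there\n" this counts the eight letters and the space between the words, but not the trailing newline.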
If I wasn't lucky enough to guess which characters you want to be counted, have a look at ctype.h, there is a table.
if ((ch == '\t') || isprint(ch))
count++;
If you want to handle tabs differently (maybe to count how many spaces they use):
if (ch == '\t') {
/* Do smth */
} else if (isprint(ch)) {
count++;
}
This should be enough.
I'm currently doing an assignment where we are to recreate three switches of the cat command, -n/-T/-E. We are to compile and enter in two parameters, the switch and the file name. I store the textfile contents into a buffer.
int main(int argc, char *argv[]){
int index = 0;
int number = 1;
int fd, n, e, t;
n = e = t = 0;
char command[5];
char buffer[BUFFERSIZE];
strcpy(command, argv[1]);
fd = open(argv[2], O_RDONLY);
if( fd == -1)
{
perror(argv[2]);
exit(1);
}
read(fd, buffer,BUFFERSIZE);
if( !strcmp("cat", command)){
printf("%s\n", buffer);
}
else if( !strcmp("-n", command)){
n = 1;
}
else if( !strcmp("-E", command)){
e = 1;
}
else if( !strcmp("-T", command)){
t = 1;
}
else if( !strcmp("-nE", command) || !strcmp("-En", command)){
n = e = 1;
}
else if( !strcmp("-nT", command) || !strcmp("-Tn", command)){
n = t = 1;
}
else if( !strcmp("-ET", command) || !strcmp("-TE", command)){
t = e = 1;
}
else if( !strcmp("-nET", command) || !strcmp("-nTE", command) ||
!strcmp("-TnE", command) || !strcmp("-EnT", command) ||
!strcmp("-ETn", command) || !strcmp("-TEn", command)){
n = e = t = 1;
}
else{
printf("Invalid Switch Entry");
}
if(n){
printf("%d ", number++);
}
while(buffer[index++] != '\0' && ( n || e || t)){
if(buffer[index] == '\n' && e && n){
printf("$\n%d ", number++);
}
else if(buffer[index] == '\n' && e){
printf("$\n");
}
else if(buffer[index] == '\t' && t){
printf("^I");
}
else if(buffer[index] == '\n' && n){
printf("\n%d ", number++);
}
else {
printf("%c", buffer[index]);
}
}
printf("\n");
close(fd);
return 0;
}
Everything works perfectly except when I try to use the -n command. It adds an extra new line. I use a textfile that has
hello
hello
hello world!
instead of
1 hello
2 hello
3 hello world!
it will print out this:
1 hello
2 hello
3 hello world!
4
For some reason it adds the extra line after the world!
Am I missing something simple?
This might not fix your problem, but I don't see any code to put the terminating null character in buffer. Try:
// Reserve one character for the null terminator.
ssize_t n = read(fd, buffer, BUFFERSIZE-1);
if ( n == -1 )
{
// Deal with error.
printf("Unable to read the contents of the file.\n");
exit(1); //???
}
buffer[n] = '\0';
The three cat options that you implement have different "modes":
-T replaces a character (no tab is written);
-E prepends a character with additional output (the new-line character is still written);
-n prepends each line with additional output.
You can handle the first two modes directly. The third mode requires information from the character before: A new line starts at the start of the file and after a new-line character has been read. So you need a flag to keep track of that.
(Your code prints a line number after a new-line character is found. That means that you have to treat the first line explicitly and that you get one line number too many at the end. After all, a file with n lines has n new-line characters and you print n + 1 line numbers.)
Other issues:
As R Sahu has pointed out, your input isn't null-terminated. You don't really need a null terminator here: read returns the number of bytes read or an error code. You can use that number as limit for index.
You increment index in the while condition, which means that inside the loop you look at the character after the one you checked, which might well be the null character. You will also miss the first character in the file.
In fact, you don't need a buffer here. When the file is larger than your buffer, you truncate it. You could call read in a loop until you read fewer bytes than BUFFERSIZE, but the simplest way in this case is to read one byte after the other and process it.
You use too many compound conditions. This isn't wrong per se, but it makes for complicated code. Your main loop reads like a big switch when there are in fact only a few special cases to treat.
The way you determine the flags is both too complicated and too restricted. You check all combinations of flags, which is 6 for the case that all flags are given. What if you add another flag? Are you going to write 24 more strcmps? Look for the minus sign as first character and then at the letters one by one, setting flags and printing error messages as you go.
You don't need to copy argv[1] to command; you are only inspecting it. And you are introducing a source of error: If argv[1] is longer than 4 characters, you will get undefined behaviour, very likely a crash.
If you don't give any options, the file name should be argv[1] instead of argv[2].
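The letter-by-letter flag parsing suggested in the list above might look roughly like this. The struct cat_flags type and the parse_switch function are names I invented for the sketch; only the three switch letters come from the question.

```c
#include <stddef.h>

/* Hypothetical option record; the fields mirror the n, e and t
 * variables used in the question. */
struct cat_flags {
    int n;  /* -n: number the lines */
    int e;  /* -E: mark line ends with $ */
    int t;  /* -T: show tabs as ^I */
};

/* Parse one argument such as "-nE".
 * Returns 1 if it is a switch and every letter was recognised,
 * 0 if it is not a switch at all (e.g. a file name),
 * and -1 if it contains an unknown letter. */
int parse_switch(const char *arg, struct cat_flags *flags)
{
    if (arg == NULL || arg[0] != '-' || arg[1] == '\0')
        return 0;

    for (const char *p = arg + 1; *p != '\0'; p++) {
        switch (*p) {
        case 'n': flags->n = 1; break;
        case 'E': flags->e = 1; break;
        case 'T': flags->t = 1; break;
        default:  return -1;
        }
    }
    return 1;
}
```

Any ordering or combination of the letters is accepted without extra code, and adding a fourth switch is one more case rather than a new batch of strcmp calls.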
Putting this (sans the flag parsing) into practice:
FILE *f = fopen(argv[2], "r");
int newline = 1; // marker for line numbers
// Error checking
for (;;)
{
int c = fgetc(f); // read one character
if (c == EOF) break; // terminate loop on end of file
if (newline) {
if (n) printf("%5d ", number++);
newline = 0;
}
if (c == '\n') {
newline = 1;
if (e) putchar('$');
}
if (c == '\t' && t) {
putchar('^');
putchar('I');
} else {
putchar(c);
}
}
fclose(f);
Edit: If you are restricted to using the Unix open, close and read, you can still use the approach above. You need an additional loop that reads blocks of a certain size with read. The read function returns the number of bytes read. If that is less than the number of bytes asked for, stop the loop.
The example below adds yet another loop that allows several files to be concatenated.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#define BUFFERSIZE 0x400
int main(int argc, char *argv[])
{
int n = 0;
int e = 0;
int t = 0;
int number = 1; /* line numbers start at 1, as in the question */
int first = 1;
while (first < argc && *argv[first] == '-') {
char *str = argv[first] + 1;
while (*str) {
switch (*str) {
case 'n': n = 1; break;
case 'E': e = 1; break;
case 'T': t = 1; break;
default: fprintf(stderr, "Unknown switch -%c.\n", *str);
exit(1);
}
str++;
}
first++;
}
while (first < argc) {
int fd = open(argv[first], O_RDONLY);
int newline = 1;
int bytes;
if (fd == -1) {
fprintf(stderr, "Could not open %s.\n", argv[first]);
exit(1);
}
do {
char buffer[BUFFERSIZE];
int i;
bytes = read(fd, buffer,BUFFERSIZE);
for (i = 0; i < bytes; i++) {
int c = buffer[i];
if (newline) {
if (n) printf("%5d ", number++);
newline = 0;
}
if (c == '\n') {
newline = 1;
if (e) putchar('$');
}
if (c == '\t' && t) {
putchar('^');
putchar('I');
} else {
putchar(c);
}
}
} while (bytes == BUFFERSIZE);
close(fd);
first++;
}
return 0;
}
I am in the process of writing an encryption program but have come to a halt because of the error: 'Segmentation Fault(Core Dumped)'. The program below is supposed to print from two input files:
The first input file: should be read in and then change the upper case characters to lower case characters and lower case characters to upper case characters.
The second input file: should be read in and just print the number of times the user desired character appears in the file. In this case the user desired character I want is the letter 'a'.
Let's say for example the first input file(input.txt) contains:
Hello My Name is Joe
This should print as: hELLO mY nAME IS jOE
Let's say for example the second input file(keys.txt) contains:
A
M
This should just print the character: A
NOTE: This doesn't actually encrypt the input file yet; I'm trying to get familiar with using more than one input file at once. I can use all the help I can get! THANK YOU!
ALSO, the program should be compiled and run like this:
gcc myProgram.c
./a.out e input.txt keys.txt
(The above 'e' just stands for encryption.)
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main(int args, char *argc[]){
int i,c,x,len,len2;
char str[1024];
char str2[500];
FILE *finp;
FILE *keyFile;
/* ****** CODE TO ENCRYPT STARTS HERE ****** */
if((argc[1]="e")&&((finp = fopen(argc[2],"r"))==NULL)
&&((keyFile=fopen(argc[3],"r"))==NULL)){
printf("Could Not Open file %s\n", argc[2]);
exit(1);
}//End First IF statement
/* *** START CODE TO GRAB FROM 1st INPUT FILE: input.txt *** */
/*Grab strings from first input file and change lower case to upper case and
upper to lower case*/
while(fgets(str,1024,finp)!=NULL){
len = strlen(str);
for(i>0;i<len;++i){
if(((str[i]>=64)&&(str[i]<=90))||((str[i]>=97&&(str[i]<=122))))
str[i]^=32;}}
/* *** END OF CODE FOR 1st INPUT FILE **** */
/* *** START CODE TO GRAB FROM 2nd INPUT FILE: keys.txt **** */
/*Grab character from second input file and print the character*/
while(fgets(str2,500,keyFile)!=NULL){
len2 = strlen(str2);
for(x>0;x<len2;++x){
if(str2[x]=='A'){
putchar(str2[x]);
}}
/* ***** END CODE FOR 2nd INPUT FILE*** */
}
printf("%s\n",str);
fclose(finp);
return 0;}
I think the main problem in your code is that you have not initialized i and x before using them.
Replace the line
for(i>0;i<len;++i){
with
for(i=0;i<len;++i){
// ^^^ i = 0; not i > 0;
and replace the line
for(x>0;x<len2;++x){
with
for(x=0;x<len2;++x){
// ^^^ x = 0; not x > 0;
You can clean up the code at the start of the function. The logic used in
if((argc[1]="e")&&((finp = fopen(argc[2],"r"))==NULL)
&&((keyFile=fopen(argc[3],"r"))==NULL)){
printf("Could Not Open file %s\n", argc[2]);
exit(1);
}//End First IF statement
is wrong on many accounts. Replace that with more readable code:
if ( strcmp(argc[1], "e") == 0 ) // argc is the argument vector in your main()
{
if ( (finp = fopen(argc[2],"r")) == NULL )
{
printf("Could Not Open file %s\n", argc[2]);
exit(1);
}
if ( (keyFile = fopen(argc[3],"r")) == NULL )
{
printf("Could Not Open file %s\n", argc[3]);
exit(1);
}
}
else
{
// Decide what you want to do when the first argument is not "e".
}
in your code
(argc[1]="e")
should be
!strcmp(argv[1], "e")
Same kind of mistake for argc[2] and argc[3].
Remember, by convention argc is the int argument count and argv is the array of char * arguments; your main() names them args and argc, which invites exactly this kind of confusion.
That said, you should always check the argument count against n before making use of argv[n-1].
Then, please keep in mind, the second operand of && is evaluated only if the first operand yields a TRUE value. You should check the logic you're using in
if((argc[1]="e")&&((finp = fopen(argc[2],"r"))==NULL)
&&((keyFile=fopen(argc[3],"r"))==NULL))
I don't think it serves the purpose you want it to serve.
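To illustrate the short-circuit rule, here is a small sketch in which probe() is a made-up stand-in for a test such as (fopen(...) == NULL); it simply records that it was evaluated. None of these names come from the question.

```c
/* Number of times probe() has run since the last reset. */
static int probe_calls = 0;

/* Hypothetical stand-in for an operand such as (fopen(...) == NULL):
 * it records that it was evaluated and yields the given result. */
static int probe(int result)
{
    probe_calls++;
    return result;
}

/* Evaluate first && second and report how many of the two
 * operands were actually evaluated. */
int operands_evaluated(int first, int second)
{
    probe_calls = 0;
    if (probe(first) && probe(second)) {
        /* both operands were non-zero */
    }
    return probe_calls;
}
```

When the first operand is zero, only one operand is evaluated. Applied to the condition in the question: if the first fopen() succeeds, its == NULL test is false, so the second fopen() is never called and keyFile is never assigned.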
Also, as pointed out by iharob, you never seem to initialize the counter variables used in either of your for loops. This will lead to undefined behavior.
You have many many errors
The first if statement, is completely wrong
The argc[1] = "e" is wrong in two ways: you can't compare strings with the == operator, and you didn't even use the comparison operator but the assignment operator. The assignment compiles, because argc[1] is a pointer, but it overwrites the argument and always yields a non-zero value, so that part of the condition is always true.
You used the && operator to check whether both files were NULL at the same time, which is false if only one of them is. Worse, if the first fopen() succeeds, the second one is never evaluated and keyFile is left uninitialized, so the code that follows invokes undefined behavior, possibly causing the SEGMENTATION FAULT.
You never check whether the program was invoked with the correct number of arguments, but you still access the argc array, which by the way is normally called argv; argc is normally used for the number of parameters, i.e. where you used args, but that doesn't actually matter.
Your for loops are also wrong
for (i > 0 ... )
you never initialize i. Also, knowing a little about how strings work in C, a C programmer would write the following loop to traverse the string
for (i = 0 ; ((str[i] != '\n') && (str[i] != '\0')) ; ++i)
fgets() stores the trailing '\n' that ends each line, so you need to stop at str[i] == '\n'. If you are paranoid you should also check for '\0'. I am paranoid and I do check, although it is slightly inefficient; I prefer that to seeing unexpected things later.
Here is a sample of your program without the mistakes, I don't know if it does what you want but it's the same program just with errors that could lead to SEGMENTATION FAULT corrected
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
int main(int argc, char *argb[])
{
int i;
char str[1024];
char str2[500];
FILE *inputFile;
FILE *keyFile;
if (argc < 4) /* insufficient number of parameters provided */
return -1;
if (argb[1][0] != 'e') /* only the 'e' (encrypt) mode is handled */
return 0;
inputFile = fopen(argb[2], "r");
if (inputFile == NULL)
{
printf("Could Not Open file %s\n", argb[2]);
return -1;
}
keyFile = fopen(argb[3], "r");
if (keyFile == NULL)
{
printf("Could Not Open file %s\n", argb[3]);
fclose(inputFile);
return -1;
}
while (fgets(str, sizeof(str), inputFile) != NULL)
{
for (i = 0 ; ((str[i] != '\n') && (str[i] != '\0')) ; ++i)
{
if (((str[i] >= 65) && (str[i] <= 90)) || ((str[i] >= 97) && (str[i]<=122))) /* 'A'..'Z' or 'a'..'z' in ASCII */
str[i] ^= 32;
}
}
while (fgets(str2, sizeof(str2), keyFile) != NULL)
{
for (i = 0 ; ((str2[i] != '\n') && (str2[i] != '\0')) ; ++i)
{
if (str2[i] == 'A')
putchar(str2[i]);
}
}
printf("%s\n", str);
fclose(inputFile);
fclose(keyFile);
return 0;
}
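The numeric ranges and the ^= 32 trick in the program above assume ASCII. As an alternative, here is a sketch of the case swap using the <ctype.h> classification functions; swap_case is a name I chose, and this is one possible way to write it rather than what the original poster intended.

```c
#include <ctype.h>

/* Swap the case of every letter in s, in place.
 * Non-letters are left untouched.
 * (swap_case is a hypothetical helper name.) */
void swap_case(char *s)
{
    for (; *s != '\0'; s++) {
        unsigned char ch = (unsigned char)*s;

        if (isupper(ch))
            *s = (char)tolower(ch);
        else if (islower(ch))
            *s = (char)toupper(ch);
    }
}
```

Applied to the sample line from the question, Hello My Name is Joe becomes hELLO mY nAME IS jOE. In the program above it could be called on str after each fgets().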
I use getc() in a C exercise, and after looking back on the program I noticed something weird. I assumed that the file given on the command line contains at least one byte (it calls getc() twice in a row without checking for EOF). After trying it on an empty file it still worked smoothly. My question is: is the behaviour of getc() on a file pointer that's been exhausted (EOF has been reached and the stream has not been rewound) undefined, or will it always continue to return EOF?
I think I could expand this question to all the I/O functions in the C standard library; please clarify this in your answer too.
Here is the code for the program. The program is supposed to strip a C/C++ source file from all comments (and it works perfectly).
#include <stdio.h>
int main(int argc, char *argv[]) {
int state = 0; // state: 0 = normal, 1 = in string, 2 = in comment, 3 = in block comment
int ignchar = 0; // number of characters to ignore
int cur, next; // current character and next one
FILE *fp; // input file
if (argc == 1) {
fprintf(stderr, "Usage: %s file.c\n", argv[0]);
return 1;
}
if ((fp = fopen(argv[1], "r")) == NULL) {
fprintf(stderr, "Error opening file.\n");
return 2;
}
cur = getc(fp); // initialise cur, assumes that the file contains at least one byte
while ((next = getc(fp)) != EOF) {
switch (next) {
case '/':
if (!state && cur == '/') {
state = 2; // start of comment
ignchar = 2; // don't print this nor next char (//)
} else if (state == 3 && cur == '*') {
state = 0; // end of block comment
ignchar = 2; // don't print this nor next char (*/)
}
break;
case '*':
if (!state && cur == '/') {
state = 3; // start of block comment
ignchar = 2; // don't print this nor next char (/*)
}
break;
case '\n':
if (state == 2) {
state = 0;
ignchar = 1; // don't print the current char (cur is still in comment)
}
break;
case '"':
if (state == 0) {
state = 1;
} else if (state == 1) {
state = 0;
}
}
if (state <= 1 && !ignchar) putchar(cur);
if (ignchar) ignchar--;
cur = next;
}
return 0;
}
Stdio files keep an "eof" flag that's set the first time end-of-file is reached and is reset only by an explicit call such as clearerr, or by a successful fseek, fsetpos, rewind or ungetc. Thus, once getc has returned EOF, it will keep returning EOF, even if new data becomes available, unless you clear the eof flag in one of these ways.
Some non-conformant implementations may immediately make new data available. This behavior is harmful and can break conformant applications.
If the EOF flag on the stream is set, getc should return EOF (and if you keep calling getc, it should keep returning EOF).
Logically, I think it should return EOF forever.
getc is defined in terms of fgetc.
The getc() function shall be equivalent to fgetc() , except that if it
is implemented as a macro it may evaluate stream more than once, so the
argument should never be an expression with side effects.
The documentation for fgetc says:
If the end-of-file indicator for the stream is set, or if the stream
is at end-of-file, the end-of-file indicator for the stream shall be set
and fgetc() shall return EOF.
And "is at end-of-file" can be determined by calling feof.
The documentation for feof says:
The feof() function shall return non-zero if and only if the
end-of-file indicator is set for stream.
So unless something happens to clear the end-of-file indicator, it should continue returning EOF forever.
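A quick way to see this for yourself is a small check like the following. It is only a sketch, and eof_repeats is a name I made up; it drains the stream, calls getc() once more, and reports whether that extra call also returned EOF with the indicator set.

```c
#include <stdio.h>

/* Read fp to its end, then call getc() one more time.
 * Returns 1 if the extra call also returns EOF and the
 * end-of-file indicator is set, 0 otherwise.
 * (eof_repeats is a hypothetical helper name.) */
int eof_repeats(FILE *fp)
{
    int c;

    while ((c = getc(fp)) != EOF)
        ;                           /* drain whatever is left */

    int again = getc(fp);           /* read past the end a second time */
    return again == EOF && feof(fp) != 0;
}
```

Calling clearerr(fp) afterwards resets the indicator, so feof(fp) returns zero again until the next read hits the end.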
EDIT:
complete code with main is here http://codepad.org/79aLzj2H
and once again this is where the weird behaviour is happening
for (i = 0; i<tab_size; i++)
{
//CORRECT OUTPUT
printf("%s\n", tableau[i].capitale);
printf("%s\n", tableau[i].pays);
printf("%s\n", tableau[i].commentaire);
//WRONG OUTPUT
//printf("%s --- %s --- %s |\n", tableau[i].capitale, tableau[i].pays, tableau[i].commentaire);
}
I have an array of the following structure
struct T_info
{
char capitale[255];
char pays[255];
char commentaire[255];
};
struct T_info *tableau;
This is how the array is populated
int advance(FILE *f)
{
char c;
c = getc(f);
if(c == '\n')
return 0;
while(c != EOF && (c == ' ' || c == '\t'))
{
c = getc(f);
}
return fseek(f, -1, SEEK_CUR);
}
int get_word(FILE *f, char * buffer)
{
char c;
int count = 0;
int space = 0;
while((c = getc(f)) != EOF)
{
if (c == '\n')
{
buffer[count] = '\0';
return -2;
}
if ((c == ' ' || c == '\t') && space < 1)
{
buffer[count] = c;
count ++;
space++;
}
else
{
if (c != ' ' && c != '\t')
{
buffer[count] = c;
count ++;
space = 0;
}
else /* more than one space*/
{
advance(f);
break;
}
}
}
buffer[count] = '\0';
if(c == EOF)
return -1;
return count;
}
void fill_table(FILE *f,struct T_info *tab)
{
int line = 0, column = 0;
fseek(f, 0, SEEK_SET);
char buffer[MAX_LINE];
char c;
int res;
int i = 0;
while((res = get_word(f, buffer)) != -999)
{
switch(column)
{
case 0:
strcpy(tab[line].capitale, buffer);
column++;
break;
case 1:
strcpy(tab[line].pays, buffer);
column++;
break;
default:
strcpy(tab[line].commentaire, buffer);
column++;
break;
}
/*if I printf each one alone here, everything works ok*/
//last word in line
if (res == -2)
{
if (column == 2)
{
strcpy(tab[line].commentaire, " ");
}
//wrong output here
printf("%s -- %s -- %s\n", tab[line].capitale, tab[line].pays, tab[line].commentaire);
column = 0;
line++;
continue;
}
column = column % 3;
if (column == 0)
{
line++;
}
/*EOF reached*/
if(res == -1)
return;
}
return ;
}
Edit :
trying this
printf("%s -- ", tab[line].capitale);
printf("%s --", tab[line].pays);
printf("%s --\n", tab[line].commentaire);
gives me as result
-- --abi -- Emirats arabes unis
I expect to get
Abu Dhabi -- Emirats arabes unis --
Am I missing something?
Does printf have side effects?
Well, it prints to the screen. That's a side effect. Other than that: no.
is printf changing its parameters
No
I get wrong results [...] what is going on?
If by wrong results you mean that the output does not appear when it should, this is probably just a line buffering issue (your second version does not print newline which may cause the output to not be flushed).
It's highly unlikely that printf is your problem. What is far, far more likely is that you're corrupting memory and your strange results from printf are just a symptom.
There are several places I see in your code which might result in reading or writing past the end of an array. It's hard to say which of them might be causing you problems without seeing your input, but here are a few that I noticed:
get_lines_count won't count the last line if it doesn't end in a newline, but your other methods will process that line
advance will skip over a newline if it is preceded by spaces, which will cause your column-based processing to get off, and could result in some of your strings being uninitialized
get_word doesn't do any bounds checks on buffer
There may be others, those were just the ones that popped out at me.
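As an illustration of the missing bounds check, here is a sketch of a reader that never writes past the end of its buffer. The name read_field and the choice to read up to a newline are mine, not a drop-in replacement for get_word; note also that the character is held in an int, so that EOF can be told apart from every valid character, which a plain char cannot guarantee.

```c
#include <stdio.h>

/* Read characters up to a newline or end-of-file into buf, storing at
 * most cap - 1 of them plus a terminating '\0'. Characters that do not
 * fit are discarded rather than written out of bounds.
 * Returns the number of characters stored, or -1 if cap is 0 or if
 * end-of-file was reached before anything was read.
 * (read_field is a hypothetical helper name.) */
long read_field(FILE *f, char *buf, size_t cap)
{
    size_t count = 0;
    int c;                          /* int, so EOF stays distinguishable */

    if (cap == 0)
        return -1;

    while ((c = getc(f)) != EOF && c != '\n') {
        if (count + 1 < cap)        /* keep one byte for the terminator */
            buf[count++] = (char)c;
    }
    buf[count] = '\0';

    if (c == EOF && count == 0)
        return -1;
    return (long)count;
}
```

The caller passes sizeof of the destination array as cap, so a long input line is truncated instead of overwriting neighbouring memory.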
I tested your code, adding the missing parts (MAX_LINE constant, main function and a sample datafile with three columns separated by 2+ whitespace), and the code works as expected.
Perhaps the code you posted is still not complete (fill_table() looks for a -999 magic number from get_word(), but get_word() never returns that), your main function is missing, so we don't know if you are properly allocating memory, etc.
Unrelated but important: it is not recommended (and also not portable) to do relative movements with fseek in text files. You probably want to use ungetc instead in this case. If you really want to move the file pointer while reading a text stream, you should use fgetpos and fsetpos.
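The ungetc suggestion could be sketched like this. skip_blanks is a name I made up, and unlike advance() in the question it does not treat a leading newline specially; it only shows how to push the look-ahead character back without fseek.

```c
#include <stdio.h>

/* Skip spaces and tabs, leaving the first other character unread.
 * Returns that character, or EOF if the stream ended first.
 * (skip_blanks is a hypothetical helper name.) */
int skip_blanks(FILE *f)
{
    int c;

    while ((c = getc(f)) == ' ' || c == '\t')
        ;                           /* consume blanks */

    if (c != EOF)
        ungetc(c, f);               /* push the look-ahead character back */
    return c;
}
```

One character of push-back is guaranteed by the standard, which is all this pattern needs.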
Your approach to getting help is very wrong. You assumed that printf had side effects without even understanding your code. The problem is clearly not in printf, but you withheld information unnecessarily. Your code is not complete. You should create a reduced testcase that compiles and displays your problem clearly, and include it in full in your question. Don't blame random library functions if you don't understand what is really wrong with your program. The problem can be anywhere.
From your comments, I am assuming that if you use these printf statements,
printf("%s\n", tableau[i].capitale);
printf("%s", tableau[i].pays);
printf("%s\n", tableau[i].commentaire);
then everything works fine...
So try replacing your single printf statement with this. (Line no. 173 in http://codepad.org/79aLzj2H)
printf("%s\n %s %s \n", tableau[i].capitale, tableau[i].pays, tableau[i].commentaire);