Trouble with characters in C - c

Why does this not compile? I can't see the error.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void)
{
char *c;
FILE *f = fopen("file.txt", "r");
if(f == NULL) {
printf("Could not open file");
}
while((c = fgetc(f)) != EOF) {
if(strcmp(c, " ") == 0) {
printf(" ");
} else if(strcmp(c, ":") == 0) {
printf(":");
} else if(strcmp(c, "#") == 0) {
printf("#");
} else if(strcmp(c, "\n") == 0) {
printf("\n");
} else {
printf("Not a valid char");
}
}
}

fgetc returns the char currently at the file pointer as an integer.
So char *c; should be int c;
and
if(strcmp(c, " ") == 0) {
should be
if(c == ' ') {
and similarly change other comparisons.
You can compact the comparisons as:
while((c = fgetc(f)) != EOF) {
if(c == ' ' || c == ':' || c == '#' || c == '\n') {
printf("%c",c);
} else {
printf("Not a valid char");
}
}

Because a char * is not a char.
The fgetc function returns a character, not a string. That means that your entire group of if statements is wrong too. You should be doing simple things like:
if( c == ' ' ) {
} else if( c == ':' ) {
} ...

Yes, fgetc() returns int, not char or char*. Why is this important? Because EOF is a negative int (usually -1). If fgetc() returns EOF and you store it in an 8-bit char, it will be represented as 0xFF. In some character sets this is a valid character, e.g. ÿ (y with diaeresis) in ISO-8859-1. Thus using
char c; // << this is wrong, use int
while((c = fgetc(aFile)) != EOF)
{
// stuff
}
you cannot distinguish between end of file and one of the characters that can legitimately appear in the stream.
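A small sketch of the problem described above. The two helper names are made up for illustration, and it assumes EOF is -1 and char has 8 bits. Converting 0xFF to a signed char is implementation-defined, though it typically yields -1.

```c
#include <stdio.h>

/* Hypothetical helper: store the value in a plain char first, as the
   wrong loop does, then compare it with EOF. */
int char_sees_eof(int value)
{
    char c = (char)value;
    return c == EOF;
}

/* Hypothetical helper: keep the value in an int, as fgetc() returns it. */
int int_sees_eof(int value)
{
    int c = value;
    return c == EOF;
}
```

With an int, int_sees_eof(0xFF) is 0 and int_sees_eof(EOF) is 1, so the two cases stay apart. With a char, char_sees_eof(0xFF) and char_sees_eof(EOF) always give the same answer: both 1 where plain char is signed (a 0xFF byte ends the loop early), both 0 where it is unsigned (the loop never ends).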

As mentioned, fgetc returns an int, not a string (so strcmp will fail). My preferred method of character comparison, though not really any different, is a switch. Since several of your cases print the input character, you might have something like:
while( (c = fgetc(f)) != EOF ) {
switch( c )
{
case ' ':
case ':':
case '#':
case '\n':
printf( "%c", c );
break;
default:
printf( "Not a valid char" );
}
}
I have found this to be the easiest way, especially when you know you'll want to expand on your conditions later (say, if you wanted to add 'f', 'o', and 'r').

Related

I need to fix these two problems in the program. Based on the inputs, I need a fix to the code so that it produces the desired output.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
void handle(FILE *np)// this is to handle newline characters
{
putc('\n', np);
}
/* skip a C multi-line comment, return the last byte read or EOF */
int m_cmnt(FILE *fp, int *lineno_p) {
FILE *np = stdout;
int prev, ch, replacement = ' ';
for (prev = 0; (ch = getc(fp)) != EOF; prev = ch) {
if (prev == '\\' && ch == 'n') {
replacement = '\n';
++*lineno_p;
}
if (prev == '*' && ch == '/')
return replacement;
}
return EOF;
}
int main(int argc, char *argv[]) {
FILE *fp = stdin, *np = stdout;
int ch,prev;
bool String = 0;
const char *filename = "<stdin>";
int lineno = 1;
fp = fopen(filename, "r");
np = fopen(argv[2], "w");
if (argc > 1) {
if ((fp = fopen(filename = argv[1], "r")) == NULL) {
fprintf(stderr, "Cannot open input file %s: \n",
filename);
exit(EXIT_FAILURE);
}
}
if (argc > 2) {
if ((np = fopen(argv[2], "w")) == NULL) {
fprintf(stderr, "Cannot open output file %s: \n",
argv[2]);
exit(EXIT_FAILURE);
}
}
while ((ch = getc(fp)) != EOF) {
if (ch == '\n')
lineno++;
/* file pointer currently not inside a string */
if (!String) {
if (ch == '/') {
ch = getc(fp);
if (ch == '\n')
lineno++;
if (ch == '*') {
int startline = lineno;
ch = m_cmnt(fp, &lineno);
if (ch == EOF) {
fprintf(stderr, "%s:%d: error: unterminated comment started on line %d\n",
filename, lineno, startline);
exit(EXIT_FAILURE);
break;
}
putc(ch, np);
} else {
putc('/', np);
putc(ch, np);
}
}
else if ( ch=='\\')/*to handle newline character*/
{
prev=ch ;
ch= getc(fp) ;
switch(ch)
{
case 'n' :
handle(np);
break ;
/*default :
putc(prev , np) ;
putc(ch , np) ;
break ;*/
}
}
else {
putc(ch, np);
}
} else {
putc(ch, np);
}
if (ch == '"' || ch == '\'')
String = !String;
}
fclose(fp);
fclose(np);
//remove(arr[1]);
//rename("temp.txt", arr[1]);
return EXIT_SUCCESS;
}
I have been working on this project for more than a week now, and I have asked several questions on this site to get to the desired result. The program is meant to remove multi-line comments from a source file and write the rest to an output file. It also needs to ignore anything inside a string literal or character literal (such as escaped characters). It is almost finished, but I still need to achieve the two outputs shown below:
INPUT1 = //*SOMECOMMENT*/
OUTPUT1 = /
INPUT2 = "this \"test"/*test*/
OUTPUT2 = "this \"test"
The current(erroneous) output is shown below
INPUT1 = //*SOMECOMMENT*/
OUTPUT1 = //*SOMECOMMENT*/ This is wrong.
INPUT2 = "this \"test"/*test*/
OUTPUT2 = "this \"test"/*test*/ This is also wrong.
The program doesn't work when a comment comes right after a forward slash (/), and it doesn't ignore escaped characters inside a string or character literal. I need a fix for these two problems, please.
If your problem is that you want to read an input stream of characters, divide that stream into tokens, and then emit only a subset of those tokens, I think Lex is exactly the tool you're looking for.
If I understand your comment correctly, the file you're trying to read in and transform is itself C code. So you will need to build up a Lex definition of the C language rules.
A quick search turned up this Lex specification of the ANSI C grammar. I cannot vouch for its accuracy or speak to its licensing. At first glance it seems to only support C89. But it is probably enough to point you in the right direction.
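If pulling in Lex feels heavy for just these two cases, a hand-written state machine is another option. The following is only a sketch under assumptions: the name strip_block_comments is made up, it works on an in-memory string rather than a FILE *, it drops comments outright instead of replacing them with a space or newline, and it ignores // comments and line continuations.

```c
#include <stddef.h>

/* Copies src to dst without C block comments and returns the number of
   characters written. dst must have room for strlen(src) + 1 bytes. */
size_t strip_block_comments(const char *src, char *dst)
{
    enum { CODE, STRING, CHARLIT, COMMENT } state = CODE;
    size_t i = 0, n = 0;

    while (src[i] != '\0') {
        char ch = src[i];

        switch (state) {
        case CODE:
            if (ch == '/' && src[i + 1] == '*') {
                state = COMMENT;
                i += 2;              /* skip the opening delimiter */
                continue;
            }
            if (ch == '"')
                state = STRING;
            else if (ch == '\'')
                state = CHARLIT;
            dst[n++] = ch;
            break;
        case STRING:
        case CHARLIT:
            dst[n++] = ch;
            if (ch == '\\' && src[i + 1] != '\0')
                dst[n++] = src[++i]; /* copy the escaped character as-is */
            else if (ch == (state == STRING ? '"' : '\''))
                state = CODE;        /* closing quote of the literal */
            break;
        case COMMENT:
            if (ch == '*' && src[i + 1] == '/') {
                state = CODE;
                i += 2;              /* skip the closing delimiter */
                continue;
            }
            break;
        }
        i++;
    }
    dst[n] = '\0';
    return n;
}
```

The first '/' of //*SOMECOMMENT*/ is copied because it is not followed by '*', and the backslash branch keeps \" from ending the string early, which are the two cases from the question.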

Making program to stop reading from file if EOF and read only from other

The only problem is that when one file is at EOF, the program still writes - or +. I need some condition so that it just takes words from one file when the other is at EOF. For example:
prvy.txt: Ahojte nasi studenti ktori maju radi programovanie
druhy.txt: vsetci mili
treti.txt:
+Ahojte -vsetci +nasi -mili +studenti +ktori +maju +radi +programovanie
#include<stdio.h>
#include<stdlib.h>
int main(){
FILE *first, *second, *third;
char ch[256],ch1[256];
int i=1,count=0, ch2;
char space = ' ';
char minus = '-';
char plus = '+';
first=fopen("prvy.txt", "r");
second=fopen("druhy.txt", "r");
third=fopen("treti.txt", "w");
if(first==NULL || second==NULL || third==NULL)
{
perror("error");
exit(1);
}
while (fscanf(first, "%255s", ch) == 1)
{
count++;
}
while (fscanf(second, "%255s", ch) == 1)
{
count++;
}
printf("%d",count);
rewind(first);
rewind(second);
for(i;i<=count;i++)
{
if(i%2==1)
{
fputc(plus,third);
ch2=fgetc(first);
while(ch2 != EOF && ch2 != ' ' && ch2 != '\n') {
putc(ch2,third);
ch2=fgetc(first);
}
}
else if(i%2==0)
{
fputc(minus,third);
ch2=fgetc(second);
while(ch2 != EOF && ch2 != ' ' && ch2 != '\n') {
putc(ch2,third);
ch2=fgetc(second);
}
}
putc(space,third);
}
fclose(first);
fclose(second);
fclose(third);
return 0;
}
Your code will alternate between the two files. That will not work as the files may contain different number of words.
One solution could be to count the words in one variable per file. Then the loop could be something like:
// count1: number of words in first file
// count2: number of words in second file
while(count1 > 0 || count2 > 0)
{
if (count1 > 0)
{
fputc(plus,third);
ch2=fgetc(first);
while(ch2 != EOF && ch2 != ' ' && ch2 != '\n') {
putc(ch2,third);
ch2=fgetc(first);
}
putc(space,third); // separate this word from the next one
--count1;
}
if (count2 > 0)
{
fputc(minus,third);
ch2=fgetc(second);
while(ch2 != EOF && ch2 != ' ' && ch2 != '\n') {
putc(ch2,third);
ch2=fgetc(second);
}
putc(space,third); // separate this word from the next one
--count2;
}
}
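The per-file counts that the loop above relies on could be obtained with a small helper. This is a sketch only: count_words is a hypothetical name, and it reuses the "%255s" scan from the question.

```c
#include <stdio.h>

/* Hypothetical helper: count whitespace-separated words, then rewind so
   the caller can read the file again from the start. */
int count_words(FILE *f)
{
    char word[256];
    int count = 0;

    while (fscanf(f, "%255s", word) == 1)
        count++;
    rewind(f);
    return count;
}
```

It would be called once per input file, for example count1 = count_words(first) and count2 = count_words(second), before entering the loop.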
You don't need to scan both files first to get a count. Instead, create an array of the two input files and use an index to toggle between them as you read. When a file is exhausted on its turn, scan and print the rest of the other one.
That way, you no longer need to track the successful input of two files simultaneously:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *in[2]; // Two alternating input files
FILE *out;
char line[80];
char prefix[] = "+-"; // Alternating signs, +/-
int index = 0; // index to in[] and prefix[]
in[0] = fopen("1.txt", "r");
in[1] = fopen("2.txt", "r");
out = fopen("3.txt", "w");
if (!(in[0] && in[1] && out)) {
perror("fopen");
exit(1);
}
while (fscanf(in[index], "%79s", line) == 1) {
fprintf(out, "%c%s ", prefix[index], line);
index = !index;
}
while (fscanf(in[!index], "%79s", line) == 1) {
fprintf(out, "%c%s ", prefix[!index], line);
}
fclose(in[0]);
fclose(in[1]);
fclose(out);
return 0;
}

Test the space in a string from a file

I am trying to test if the character in a file.txt is a space ' ' or not using this code:
char *Appartient (FILE *f, char *S)
{
int i = 0, nbdechar = 0, nbocc = 0, PosdePremierChar, space = 0;
char c;
while ((c = getc(f)) != EOF) {
PosdePremierChar = ftell(f);
if (c == S[0]) {
nbdechar = 0;
for (i = 1; i < strlen(S); i++) {
c = getc(f);
if (c == S[i]) {
nbdechar++;
}
}
if (nbdechar == strlen(S) - 1) {
nbocc++;
} else {
rewind(f);
fseek(f, PosdePremierChar - 1, SEEK_CUR);
while ((c = getc(f)) != ' ');
}
} else {
while ((c = getc(f)) != ' ') {
space++;
}
}
}
printf("\n Le nb d'occurence est %d", nbocc);
if (nbocc == 0) {
return "false";
} else {
return "true";
}
}
but a weird symbol 'ے' appears as garbage when I inspect the variable 'c' in my debugger.
What is wrong?
Could be the result of converting the end-of-file result from getc(), EOF, (which is standardized to be negative, often -1) to a character.
Note that your loop never terminates if there's no space in the file, since EOF != ' ' and that condition keeps being true after you hit end-of-file for the first time.
Modify your code like this, trace it and you might become enlightened regarding the relation between what getc() returns and how this correlates to chars:
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
int result = EXIT_SUCCESS;
FILE * f = fopen("test.txt", "r");
if (NULL == f)
{
perror("fopen() failed");
result = EXIT_FAILURE;
}
else
{
int result = EOF;
while (EOF != (result = getc(f)))
{
char c = result;
printf("\n%d is 0x%02x is '%c'", result, result, c);
if (' ' == c)
{
printf(" is space ");
}
}
printf("\nread EOF = %d = 0x%x\n", result, result);
fclose(f);
}
return result;
}
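The inner loops of the form while ((c = getc(f)) != ' '), which never terminate once end-of-file is reached, can be made safe by testing for EOF as well. A minimal sketch, with a hypothetical helper name:

```c
#include <stdio.h>

/* Reads up to and including the next space. Returns ' ' if one was
   found, or EOF if the stream ended first. */
int skip_to_space(FILE *f)
{
    int c;   /* int, so that EOF stays distinct from every byte value */

    while ((c = getc(f)) != EOF && c != ' ')
        ;    /* discard the character */
    return c;
}
```

The caller can then stop its outer loop as soon as the helper returns EOF.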
You didn't test whether f was opened. If fopen() failed, using f is undefined behavior, so check that the file opened:
FILE *file;
int chr;
if ((file = fopen("test.txt", "r")) == NULL)
{
fprintf(stderr, "Cannot open `test.txt'\n");
return -1;
}
while ((chr = fgetc(file)) != EOF)
{
if (chr == ' ')
printf("space\n");
}
You should declare chr as int, because fgetc() returns an int: EOF has to be a value distinct from every character, so it does not fit in a char.
Also, the debugger is useful for tracking the values of variables. I bet it can give you the value in ASCII, decimal, or hex, as needed, if you know how to ask.

Word Searching using fgetc

I am trying to implement a word search using fgetc. I understand what fgetc does, but I am getting a segfault. Running the program under gdb returns the following. Is there an easier way to implement the search function? I am new to programming.
Thank you for the help.
#0 0x00007ffff7aa4c64 in getc () from /lib64/libc.so.6
#1 0x000000000040070c in main ()
Where am I going wrong?
#include <stdio.h>
#include <stdlib.h>
int isAlpha(char c)
{
if( c >= 'A' && c <='Z' || c >= 'a' && c <='z' || c >= '0' && c <= '9' )
{
return 1;
}
else
{
return 0;
}
}
int CheckFunctionn(int length, int message_counter, char ref_word[], char newmessage[])
{
int newCounter = 0;
int counterSuccess = 0;
while(newCounter < length)
{
if(ref_word[newCounter] == newmessage[newCounter + message_counter])
{
counterSuccess++;
}
newCounter++;
}
if(counterSuccess == length)
{
return 1;
}
else
{
return 0;
}
}
int main(int argc, char *argv[])
{
char message[300];
int counter = 0;
int ref_length = 0;
int alphaCounter = 0;
int alphaCounterTime = 0;
int messageCounter = 0;
int word_counter = 0;
FILE* input;
FILE* output;
//long fileLength;
//int bufferLength;
//char readFile;
//int forkValue;
input = fopen(argv[2],"r");
output = fopen(argv[3],"w");
int c;
c = fgetc(input);
while(c != EOF)
{
while((argv[1])[ref_length] !='\0')
{
// if string is "HEY", (argv[1]) is HEY, ref_counter is the length
// which in this case will be 3.
ref_length++; //<-- takes care of the length.
}
while(alphaCounter < ref_length)
{
// this will add to alphaCounter everyetime alphaCT is success.
alphaCounterTime += isAlpha((argv[1])[alphaCounter]);
alphaCounter++;
}
if(alphaCounterTime != ref_length)
{
return 0;
}
if((messageCounter == 0 ) && (message[messageCounter + ref_length] == ' ' || message[messageCounter] == '\n' || message[messageCounter]== '\t')) // counts the whole things and brings me to space
{
// compare the message with the word
word_counter += CheckFunctionn(ref_length, messageCounter, argv[1], message);
}
if((message[messageCounter] == ' ' || message[messageCounter] == '\n' || message[messageCounter]== '\t') && (message[messageCounter + ref_length + 1] == ' ' || message[messageCounter + ref_length + 1] == '\n' || message[messageCounter + ref_length + 1]== '\t'))
{
word_counter += CheckFunctionn(ref_length, messageCounter + 1, argv[1], message);
}
if((message[messageCounter]== ' '|| message[messageCounter] == '\n' || message[messageCounter]== '\t') && (messageCounter + ref_length+1)== counter) //<-- this means the length of the message is same
{
word_counter += CheckFunctionn(ref_length, messageCounter + 1, argv[1], message);
}
messageCounter++;
}
fclose(input);
fclose(output);
return 0;
}
You're almost certainly failing to open the input file. If fopen fails, it returns NULL, and calling fgetc(NULL) has undefined behavior, and a segmentation fault is one possible outcome of undefined behavior.
You need to check for errors and handle them accordingly. You also need to check whether your program was given sufficient arguments. Here's one way to handle them (strerror and errno need <string.h> and <errno.h>):
if (argc < 3)
{
fprintf(stderr, "Usage: %s input-file output-file\n", argv[0]);
exit(1);
}
input = fopen(argv[1],"r");
if (input == NULL)
{
fprintf(stderr, "Error opening input file %s: %s\n", argv[1], strerror(errno));
exit(1);
}
output = fopen(argv[2],"w");
if (output == NULL)
{
fprintf(stderr, "Error opening output file %s: %s\n", argv[2], strerror(errno));
exit(1);
}
You only read one character into c, then loop while(c != EOF) which is almost always an infinite loop. Inside that loop, you increment messageCounter which you use to walk past the end of an array -- boom!
Per your comment, argc is 2, but you refer to argv[2] which would be the third element of the args, and will be NULL. The FILE * is going to end up being NULL too (because it's invalid to pass NULL to fopen).
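The infinite loop and the out-of-bounds index described above can both be avoided by reading inside the loop condition and bounding the index by the buffer size. A sketch, assuming a hypothetical helper read_message that fills a buffer such as the 300-byte message array from the question:

```c
#include <stdio.h>

/* Reads at most size - 1 characters from in into buf and terminates it
   with '\0'. Returns the number of characters stored. */
size_t read_message(FILE *in, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;
    /* Check the bound first so no character is read and then dropped. */
    while (n < size - 1 && (c = fgetc(in)) != EOF)
        buf[n++] = (char)c;
    buf[n] = '\0';
    return n;
}
```

A call such as counter = read_message(input, message, sizeof message) would also give counter a meaningful value, which the original code never sets.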
It will be much easier if you use the string functions for this.
First find the length of your file using fseek and ftell, then allocate that much memory (plus one byte for a terminating '\0') and fill it using fgetc, fgets, or any other file function. After that, search the buffer with strstr; strcmp only compares whole strings, so it cannot find a word inside a longer text.
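A sketch of that approach, under assumptions: count_occurrences is a made-up name, the stream must be seekable, and strstr is used for the search step, so it counts substrings (searching for "hey" also matches inside "they") and stops at any embedded '\0' byte.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: load the whole stream into memory and count the
   non-overlapping occurrences of word. Returns -1 on error. */
long count_occurrences(FILE *f, const char *word)
{
    long size, count = 0;
    size_t len = strlen(word), got;
    char *text, *p;

    if (len == 0 || fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0)
        return -1;
    rewind(f);

    text = malloc((size_t)size + 1);
    if (text == NULL)
        return -1;
    got = fread(text, 1, (size_t)size, f);
    text[got] = '\0';            /* make the buffer a valid C string */

    for (p = text; (p = strstr(p, word)) != NULL; p += len)
        count++;

    free(text);
    return count;
}
```

Matching whole words only would additionally require checking the characters before and after each hit.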

Implementing fgetc; trying to read word by word

I am trying to read word by word, and below is the logic that I have adopted. This reads the words fine, except when it gets to the last word in a line: it then stores the last word of the current line AND the first word of the next line together. Could somebody tell me how I can get this to work?
int c;
int i =0;
char line[1000];
do{
c = fgetc(fp);
if( c != ' '){
printf("%c", c);
line[i++] = c;
}else if((c == '\n')){
//this is where It should do nothing
}else{
line[i] = '\0';
printf("\\0 reached\n");//meaning end of one word has been reached
strcpy(wordArr[counter++].word, line);//copy that word that's in line[xxx] to the struct's .word Char array
i=0;//reset the line's counter
}//if loop end
} while(c != EOF);//do-while end
fp is a file pointer.
HI BABY TYPE MAYBE
TODAY HELLO CAR
HELLO ZEBRA LION DON
TYPE BABY
I am getting (w/o quotes)
"HI"
"BABY"
"TYPE"
"MAYBE
TODAY"
Look at this:
if(c != ' ') {
// ...
} else if(c == '\n') {
// WILL NEVER BE REACHED
}
If c == '\n', then c != ' ' is also true, which means the second block will be skipped, and the first block will run for all '\n' characters, (i.e. they will be printed).
Other answers about line endings are wrong. C FILE *s not opened in binary mode will take care of EOL for you. If you have a file from DOS and you read it on Unix it might create problems, but I doubt that's your problem here, and if it were, handling it could be a little more complicated than the answers here show. But you can cross that bridge when you reach it.
The encoding of the line terminator differs from one operating system to another. In Linux, it is simply '\n', while in Windows and DOS it is "\r\n". So, depending on your target OS, you may need to change your statement to something like:
if((c == '\r') || (c == '\n'))
{
//...
}
EDIT: after looking closely, I think that what you're doing wrong is that the first if statement is true even when you read the \n, so you should handle it this way:
if((c != ' ') && (c != '\n')){
printf("%c", c);
line[i++] = c;
}
else if((c == '\n') || (c == '\r')){
//this is where It should do nothing
}
else{
//...
}
Try this;
if((c == '\n') || (c == '\r')){
change
if( c != ' ')
to
if( c != ' '&&c!='\n')
this should fix the problem
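The separator test and the EOF test can also live together in one helper. This is a sketch only: next_word is a made-up name, and it relies on isspace, which treats space, '\n', '\r' and tab alike.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads the next whitespace-delimited word into word, keeping at most
   size - 1 characters (longer words are truncated).
   Returns 1 if a word was read, 0 at end of file. */
int next_word(FILE *fp, char *word, size_t size)
{
    size_t i = 0;
    int c;

    if (size == 0)
        return 0;
    /* Skip leading separators. */
    while ((c = fgetc(fp)) != EOF && isspace(c))
        ;
    /* Collect characters until the next separator or EOF. */
    while (c != EOF && !isspace(c)) {
        if (i < size - 1)
            word[i++] = (char)c;
        c = fgetc(fp);
    }
    word[i] = '\0';
    return i > 0;
}
```

Because EOF is tested before anything is stored, the end-of-file value never ends up in the word buffer, unlike in the do-while loop of the question.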
This works for me (on Linux):
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
main(int argc, char **argv)
{
int c; /* int, not char: fgetc() returns an int so that EOF can be detected */
size_t i = 0;
FILE *file = NULL;
char buffer[BUFSIZ];
int status = EXIT_SUCCESS;
if (argc < 2) {
fprintf(stderr, "%s <FILE>\n", argv[0]);
goto error;
}
file = fopen(argv[1], "r");
if (!file) {
fprintf(stderr, "%s: %s: %s\n", argv[0], argv[1],
strerror(errno));
goto error;
}
while (EOF != (c = fgetc(file))) {
if (BUFSIZ == i) {
fprintf(stderr, "%s: D'oh! Write a program that "
"doesn't use static buffers\n",
argv[0]);
goto error;
}
if (' ' == c || '\n' == c) {
buffer[i++] = '\0';
fprintf(stdout, "%s\n", buffer);
i = 0;
} else if ('\r' == c) {
/* ignore */
} else {
buffer[i++] = c;
}
}
exit:
if (file) {
fclose(file);
}
return status;
error:
status = EXIT_FAILURE;
goto exit;
}
