I have written code to remove comments from a C program file, and print the output on the console:
#include <stdio.h>
#include <stdlib.h>
void incomment(FILE *fp);
void rcomment(int c, FILE *fp);
void echo_quote(int c, FILE *fp);
int main() {
FILE *fp;
fp = fopen("temp.c", "r");
int c;
while ((c = getc(fp)) != EOF) {
rcomment(c, fp);
}
return 0;
}
void incomment(FILE* fp) {
int c, d;
c = getc(fp);
d = getc(fp);
while (c != '*' && d != '/') {
c = d;
d = getc(fp);
}
}
void echo_quote(int c, FILE *fp) {
int d;
putchar(c);
while ((d = getc(fp)) != c) {
putchar(d);
if (d == '\\')
putchar(getc(fp));
}
putchar(d);
}
void rcomment(int c, FILE *fp) {
int d;
if (c == '/') {
if ((d = getc(fp)) == '*')
incomment(fp);
else
if (d == '/') {
putchar(c);
rcomment(d, fp);
} else {
putchar(c);
putchar(d);
}
} else
if (c == '\'' || c == '"')
echo_quote(c, fp);
else
putchar(c);
}
However for the following input:
#include<stdio.h>
/* Author : XYZ
* Date : 21/1/2016
*/
int main()
{
int a; // / variable a
printf("/*Hi*/");
return 0;
}
OUTPUT:
#include<stdio.h>
Date : 21/1/2016
*/
int main()
{
int a; // / variable a
printf("/*Hi*/");
return 0;
}
Could someone point out the error in the code? It seems to work fine for comments within quotes, but not for single-line comments.
The rcomment() function does not parse single-line comments correctly:
If you match a '/' for the second character, you should read all remaining characters up to the newline and output just the newline.
If the second character is a quote, you should output the first character and then parse the literal, which the code fails to do. An easy way to do this is to unget the second character with ungetc(d, fp); and only output c.
There are other special cases you do not handle:
Escaped newlines should be handled in literals and single-line comments, as well as between the / and the * at the start of a multi-line comment and between the * and the / at the end. You can do this with a utility function that reads bytes from the file and handles escaped newlines, but it will be difficult to write them back to the output file so as to preserve the line count.
You should replace multi-line comments with a single space or with newline characters to avoid pasting tokens together and to preserve the line count.
incomment() and echo_quote() should handle a premature end of file. As currently coded, they run indefinitely.
This parsing task is more subtle than it looks. You could try another approach and implement a state machine.
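The premature end-of-file problem in incomment() can be sketched as follows. Note also that the original loop condition, c != '*' && d != '/', stops as soon as either character matches, which is why the sample output resumes at "Date" right after the first '*' inside the comment. The function name skip_block_comment and its int return value are assumptions made for this sketch, not part of the original code.

```c
#include <stdio.h>

/* Hypothetical replacement for incomment(): skips the body of a block
   comment whose opening delimiter has already been read.
   Returns 0 when the closing delimiter was found, EOF if input ended first. */
int skip_block_comment(FILE *fp) {
    int prev = 0;   /* previous character read inside the comment */
    int c;

    while ((c = getc(fp)) != EOF) {
        if (prev == '*' && c == '/')
            return 0;           /* found the closing star-slash pair */
        prev = c;
    }
    return EOF;                 /* unterminated comment */
}
```

The caller can use the return value to report an unterminated comment instead of looping forever.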
Here is a quick fix for the rcomment() function, but the other issues above remain:
int peekc(FILE *fp) {
int c = getc(fp);
if (c != EOF)
ungetc(c, fp);
return c;
}
void rcomment(int c, FILE *fp) {
int d;
if (c == '/') {
if ((d = getc(fp)) == '*') {
incomment(fp);
} else
if (d == '/') {
while ((c = getc(fp)) != EOF && c != '\n') {
if (c == '\\' && peekc(fp) == '\n') {
putchar(getc(fp));
}
}
putchar('\n');
} else {
putchar(c);
ungetc(d, fp);
}
} else
if (c == '\'' || c == '"') {
echo_quote(c, fp);
} else {
putchar(c);
}
}
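The state-machine approach mentioned above could look roughly like this. It is a minimal sketch, not a complete solution: the state names and the function strip_comments are invented for illustration, each block comment is replaced by one space plus any newlines it contained, and escaped newlines (a backslash at the end of a line) are deliberately not handled.

```c
#include <stdio.h>

/* States of the scanner. The names are illustrative, not from the question. */
enum state { CODE, SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR,
             LITERAL, LITERAL_ESC };

/* Copies in to out without comments. Each block comment becomes one space
   (plus any newlines it contained); each line comment keeps its newline. */
void strip_comments(FILE *in, FILE *out) {
    enum state s = CODE;
    int quote = 0;  /* the opening ' or " while inside a literal */
    int c;

    while ((c = getc(in)) != EOF) {
        switch (s) {
        case CODE:
            if (c == '/') {
                s = SLASH;              /* hold the slash until we know more */
            } else {
                if (c == '"' || c == '\'') {
                    quote = c;
                    s = LITERAL;
                }
                putc(c, out);
            }
            break;
        case SLASH:
            if (c == '*') {
                s = BLOCK_COMMENT;
            } else if (c == '/') {
                s = LINE_COMMENT;
            } else {
                putc('/', out);         /* it was an ordinary slash */
                ungetc(c, in);          /* re-scan c in the CODE state */
                s = CODE;
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') {
                putc('\n', out);        /* keep the line break */
                s = CODE;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*')
                s = BLOCK_STAR;
            else if (c == '\n')
                putc('\n', out);        /* preserve the line count */
            break;
        case BLOCK_STAR:
            if (c == '/') {
                putc(' ', out);         /* avoid pasting tokens together */
                s = CODE;
            } else if (c != '*') {
                if (c == '\n')
                    putc('\n', out);
                s = BLOCK_COMMENT;
            }
            break;
        case LITERAL:
            putc(c, out);
            if (c == '\\')
                s = LITERAL_ESC;        /* next character is escaped */
            else if (c == quote)
                s = CODE;
            break;
        case LITERAL_ESC:
            putc(c, out);
            s = LITERAL;
            break;
        }
    }
    if (s == SLASH)
        putc('/', out);                 /* input ended right after a slash */
}
```

Because every state simply falls out of the loop on EOF, an unterminated comment or literal cannot make the function run indefinitely.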
I am working on a management system project and want to clear the file before adding data to it. I am using this code as a reference. I have rewritten the code from the reference and, instead of writing the data from the temporary file (tmp) back to the original (FILE_NAME), I print it to the terminal.
When I compile and run the program, it prints all the content and a few more lines after the end of the file. After this it stops and doesn't finish execution. I have added comments to help explain my thought process.
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#define BUFFER_SIZE 1000
#define FILE_NAME "data.csv"
int main()
{
FILE* file;
char buffer[BUFFER_SIZE];
// Opening file
if(file = fopen(FILE_NAME, "r+"))
{
char c; // To get character from buffer
int i = 0; // Index for the buffer character
int isEmpty = 1; // If the line is empty
FILE* tmp;
if(tmp = tmpfile())
{
while(1)
{
buffer[i++] = c;
if(c != '\n') // Checking for blank lines
{
isEmpty = 0;
}
else
{
if(c == '\n' && isEmpty == 0) // Read a word; Print to tmp file
{
buffer[i] = '\0';
fprintf(tmp, "%s", buffer);
i = 0;
isEmpty = 1;
}
else if(c == '\n' && isEmpty == 1) // NOT SURE WHY THIS IS IMPORTANT
{
buffer[i] = '\0';
i = 0;
isEmpty = 1;
}
}
if(c == EOF)
{ break; }
while(1) // Loop to print contents of tmp file onto terminal
{
c = getc(tmp);
printf("c: %c", c);
if(c == EOF)
{ break; }
}
}
}
else
{
printf("Unable to open temporary file\n");
}
fclose(file);
}
else
{
printf("Unable to open file.");
}
getchar();
return 0;
}
UPDATE:
I've modified a few lines and have got it working.
I'd forgotten to assign c in the above program. Also, @Barmar, won't char c work just as well as int c? Characters can be integers as well, right?
Why would large indentations lead to bugs? I find the blocks of code to be more differentiated.
#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#define BUFFER_SIZE 1000
#define FILE_NAME "data.csv"
int main()
{
// Variable Declaration
FILE* file;
char buffer[BUFFER_SIZE];
// Opening file
if( file = fopen(FILE_NAME, "r+") )
{
char c; // Reading characters from the file
int i; // Index of the characters
int isEmpty = 1; // 1-> character is empty; 0-> character is not empty
FILE* tmp;
if( tmp = fopen("tmp.csv", "a+") )
{
char c; // Reading characters from files
int i = 0; // Index
int isEmpty = 1; // 1->previous word is empty; 0->previous word is not empty
while( (c = getc(file)) != EOF)
{
if( c != '\n' && c != ' ' && c != '\0' && c != ',')
{
isEmpty = 0;
buffer[i++] = c;
}
else
{
if( c == '\n' && isEmpty == 0 )
{
buffer[i] = '\0';
fprintf(tmp, "%s", buffer);
i = 0;
isEmpty = 1;
}
else if( c == '\n' && isEmpty == 1 )
{
buffer[i] = '\0';
i = 0;
}
}
}
fclose(tmp);
}
else
{
printf("Unable to open temporary file\n");
}
fclose(file);
}
else
{
printf("Unable to open file\n");
}
return 0;
}
Are there ways to simplify the program and make it more compact or less error-prone?
Hello Stack Overflow! I am in the process of learning C. I have a function which takes an input file, reads through it, and writes the contents to the output file without the comments.
The function works, but it also breaks in some cases.
My Function:
void removeComments(char* input, char* output)
{
FILE* in = fopen(input,"r");
FILE* out = fopen(ouput,"w");
char c;
while((c = fgetc(in)) != EOF)
{
if(c == '/')
{
c = fgetc(in);
if(c == '/')
{
while((c = fgetc(in)) != '\n');
}
else
{
fputc('/', out);
}
}
else
{
fputc(c,out);
}
}
fclose(in);
fclose(out);
}
But when I give a file like this as input:
// Parameters: a, the first integer; b the second integer.
// Returns: the sum.
int add(int a, int b)
{
return a + b; // An inline comment.
}
int sample = sample;
When removing the inline comment it fails to reach the '\n' for some reason and it gives output:
int add(int a, int b)
{
return a + b; }
int sample = sample;
[EDIT]
Thanks for helping me! It works with the case I posted, but it breaks in another.
Current code:
FILE* in = fopen(input,"r");
FILE* out = fopen(output,"w");
if (in == NULL) {
printf("cannot read %s\n", input);
return; /* change signature to return 0 ? */
}
if (out == NULL) {
printf("cannot write in %s\n", output);
return; /* change signature to return 0 ? */
}
int c;
int startline = 1;
while((c = fgetc(in)) != EOF)
{
if(c == '/')
{
c = fgetc(in);
if(c == '/')
{
while((c = fgetc(in)) != '\n')
{
if (c == EOF) {
fclose(in);
fclose(out);
return; /* change signature to return 1 ? */
}
}
if (! startline)
fputc('\n', out);
startline = 1;
}
else if (c == EOF)
break;
else {
fputc('/', out);
startline = 0;
}
}
else
{
fputc(c,out);
startline = (c == '\n');
}
}
fclose(in);
fclose(out);
When the file contains division the second variable disappears.
Example:
int divide(int a, int b)
{
return a/b;
}
It gives back:
int divide(int a, int b)
{
return a/;
}
After
while((c = fgetc(in)) != '\n');
you need a fputc('\n', out); to write the newline that the loop has just consumed.
Additional remarks:
In
char c;
while((c = fgetc(in)) != EOF)
c must be an int so that EOF can be distinguished from valid characters.
There is a typo: ouput must be output for the code to compile.
You do not handle EOF correctly after reading a '/'.
You do not check the result of fopen.
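The first remark can be illustrated with a small sketch. fgetc returns an int precisely so that EOF, a negative value, stays distinct from every byte value. The helper count_bytes is a hypothetical name used only for this illustration.

```c
#include <stdio.h>

/* Counts the bytes remaining in fp. c is an int so that EOF stays
   distinct from every byte value, including 0xFF. */
long count_bytes(FILE *fp) {
    long n = 0;
    int c;

    while ((c = fgetc(fp)) != EOF)
        n++;
    return n;
}
```

With char c instead, a byte of value 0xFF can compare equal to EOF where char is signed, which ends the loop early, and where char is unsigned the comparison with EOF never succeeds, so the loop never ends.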
A proposal :
#include <stdio.h>
void removeComments(char* input, char* output)
{
FILE* in = fopen(input,"r");
FILE* out = fopen(output,"w");
if (in == NULL) {
printf("cannot read %s\n", input);
return; /* change signature to return 0 ? */
}
if (out == NULL) {
printf("cannot write in %s\n", output);
return; /* change signature to return 0 ? */
}
int c;
while((c = fgetc(in)) != EOF)
{
if(c == '/')
{
c = fgetc(in);
if(c == '/')
{
while((c = fgetc(in)) != '\n')
{
if (c == EOF) {
fclose(in);
fclose(out);
return; /* change signature to return 1 ? */
}
}
fputc('\n', out);
}
else if (c == EOF) {
fputc('/', out);
break;
}
else {
fputc('/', out);
fputc(c, out);
}
}
else
{
fputc(c,out);
}
}
fclose(in);
fclose(out);
/* change signature to return 1 ? */
}
int main(int argc, char ** argv)
{
removeComments(argv[1], argv[2]);
}
As Tormund Giantsbane says in a remark, it is better to completely remove a line that contains only a comment (a comment starting in the first column). This new proposal does that:
#include <stdio.h>
void removeComments(char* input, char* output)
{
FILE* in = fopen(input,"r");
FILE* out = fopen(output,"w");
if (in == NULL) {
printf("cannot read %s\n", input);
return; /* change signature to return 0 ? */
}
if (out == NULL) {
printf("cannot write in %s\n", output);
return; /* change signature to return 0 ? */
}
int c;
int startline = 1;
while((c = fgetc(in)) != EOF)
{
if(c == '/')
{
c = fgetc(in);
if(c == '/')
{
while((c = fgetc(in)) != '\n')
{
if (c == EOF) {
fclose(in);
fclose(out);
return; /* change signature to return 1 ? */
}
}
if (! startline)
fputc('\n', out);
startline = 1;
}
else if (c == EOF) {
fputc('/', out);
break;
}
else {
fputc('/', out);
fputc(c, out);
startline = 0;
}
}
else
{
fputc(c,out);
startline = (c == '\n');
}
}
fclose(in);
fclose(out);
/* change signature to return 1 ? */
}
int main(int argc, char ** argv)
{
removeComments(argv[1], argv[2]);
}
Compilation and execution :
pi@raspberrypi:/tmp $ gcc -pedantic -Wextra -g r.c
pi@raspberrypi:/tmp $ cat i
// Parameters: a, the first integer; b the second integer.
// Returns: the sum.
int add(int a, int b)
{
return a + b/c; // An inline comment.
}
int sample = sample;
pi@raspberrypi:/tmp $ ./a.out i o
pi@raspberrypi:/tmp $ cat o
int add(int a, int b)
{
return a + b/c;
}
int sample = sample;
As DavidC. says in a remark, if // appears inside a string the result will not be the expected one. The same is true inside a character constant, even an illegal one (I mean '//' must not be changed). There are also the C comments to consider (/* .. // ... */), and so on.
When removing the inline comment it fails to reach the '\n' for some reason
Well no, if it failed to reach or see the newline at the end of an inline comment then the program would, presumably, consume the entire rest of the file. What it actually fails to do is write such newlines to the output.
Consider your comment-eating code:
while((c = fgetc(in)) != '\n');
That loop terminates when a newline is read. At that point, the newline, having already been read, is not available to be read from the input again, so your general read / write provisions will not handle it. If you want such newlines to be preserved, then you need to print them in the comment-handling branch.
Additional notes:
fgetc returns an int, not a char, and you need to handle it as such in order to be able to correctly detect end-of-file.
Your program will go into an infinite loop if the input ends with an inline comment that is not terminated by a newline. Such source is technically non-conforming, but even so, you ought to handle it.
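Both points can be combined in one small helper. This is only a sketch under the assumption that the caller has already consumed the two slashes; the name skip_line_comment is made up for the example.

```c
#include <stdio.h>

/* Consumes the rest of a // comment from in. If the comment ended with a
   newline, that newline is written to out. Returns the last value read,
   which is either '\n' or EOF, so the caller can stop at end of input. */
int skip_line_comment(FILE *in, FILE *out) {
    int c;   /* int, not char, so EOF can be detected */

    while ((c = fgetc(in)) != EOF && c != '\n')
        ;    /* discard the comment text */
    if (c == '\n')
        fputc('\n', out);
    return c;
}
```

Testing for EOF inside the loop condition is what prevents the infinite loop when the file ends inside a comment.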
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
void handle(FILE *np)// this is to handle newline characters
{
putc('\n', np);
}
/* skip a C multi-line comment, return the last byte read or EOF */
int m_cmnt(FILE *fp, int *lineno_p) {
FILE *np = stdout;
int prev, ch, replacement = ' ';
for (prev = 0; (ch = getc(fp)) != EOF; prev = ch) {
if (prev == '\\' && ch == 'n') {
replacement = '\n';
++*lineno_p;
}
if (prev == '*' && ch == '/')
return replacement;
}
return EOF;
}
int main(int argc, char *argv[]) {
FILE *fp = stdin, *np = stdout;
int ch,prev;
bool String = 0;
const char *filename = "<stdin>";
int lineno = 1;
fp = fopen(filename, "r");
np = fopen(argv[2], "w");
if (argc > 1) {
if ((fp = fopen(filename = argv[1], "r")) == NULL) {
fprintf(stderr, "Cannot open input file %s: \n",
filename);
exit(EXIT_FAILURE);
}
}
if (argc > 2) {
if ((np = fopen(argv[2], "w")) == NULL) {
fprintf(stderr, "Cannot open output file %s: \n",
argv[2]);
exit(EXIT_FAILURE);
}
}
while ((ch = getc(fp)) != EOF) {
if (ch == '\n')
lineno++;
/* file pointer currently not inside a string */
if (!String) {
if (ch == '/') {
ch = getc(fp);
if (ch == '\n')
lineno++;
if (ch == '*') {
int startline = lineno;
ch = m_cmnt(fp, &lineno);
if (ch == EOF) {
fprintf(stderr, "%s:%d: error: unterminated comment started on line %d\n",
filename, lineno, startline);
exit(EXIT_FAILURE);
break;
}
putc(ch, np);
} else {
putc('/', np);
putc(ch, np);
}
}
else if ( ch=='\\')/*to handle newline character*/
{
prev=ch ;
ch= getc(fp) ;
switch(ch)
{
case 'n' :
handle(np);
break ;
/*default :
putc(prev , np) ;
putc(ch , np) ;
break ;*/
}
}
else {
putc(ch, np);
}
} else {
putc(ch, np);
}
if (ch == '"' || ch == '\'')
String = !String;
}
fclose(fp);
fclose(np);
//remove(arr[1]);
//rename("temp.txt", arr[1]);
return EXIT_SUCCESS;
}
I have been working on this project for more than a week now, and I have asked many questions on this site to help me get the desired result. The purpose of this program is to remove multi-line comments from a source file and write the rest to an output file. It also needs to ignore anything inside a string literal or character literal (such as escaped characters). I am close to finishing it, but I still need to achieve the two outputs shown below:
INPUT1 = //*SOMECOMMENT*/
OUTPUT1 = /
INPUT2 = "this \"test"/*test*/
OUTPUT2 = "this \"test"
The current (erroneous) output is shown below:
INPUT1 = //*SOMECOMMENT*/
OUTPUT1 = //*SOMECOMMENT*/ This is wrong.
INPUT2 = "this \"test"/*test*/
OUTPUT2 = "this \"test"/*test*/ This is also wrong.
The program doesn't work when a comment comes directly after a forward slash (/). The second failure is that it doesn't ignore escaped characters inside a string or character literal. I need a fix for these two problems, please.
If your problem is that you want to read an input stream of characters, divide that stream into tokens, and then emit only a subset of those tokens, I think Lex is exactly the tool you're looking for.
If I understand your comment correctly, the file you're trying to read in and transform is itself C code. So you will need to build up a Lex definition of the C language rules.
A quick search turned up this Lex specification of the ANSI C grammar. I cannot vouch for its accuracy or speak to its licensing. At first glance it seems to only support C89. But it is probably enough to point you in the right direction.
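As a hand-written alternative to a generated lexer, the two failing cases from the question can be handled with a small scanner. This is a sketch under stated assumptions, not a drop-in replacement for the posted program: the function name strip_block_comments is invented here, only block comments are removed, nothing is written in their place, and an unterminated comment is silently dropped.

```c
#include <stdio.h>

/* Copies in to out, dropping only block comments. Text inside string and
   character literals is copied verbatim, including escaped quotes. */
void strip_block_comments(FILE *in, FILE *out) {
    int quote = 0;   /* 0 outside literals, else the opening ' or " */
    int c;

    while ((c = getc(in)) != EOF) {
        if (quote) {                      /* inside a literal */
            putc(c, out);
            if (c == '\\') {              /* copy the escaped character too */
                if ((c = getc(in)) == EOF)
                    break;
                putc(c, out);
            } else if (c == quote) {
                quote = 0;                /* literal closed */
            }
        } else if (c == '/') {
            int d = getc(in);
            if (d == '*') {               /* skip to the closing star-slash */
                int prev = 0;
                while ((c = getc(in)) != EOF && !(prev == '*' && c == '/'))
                    prev = c;
            } else {
                putc('/', out);           /* an ordinary slash */
                if (d != EOF)
                    ungetc(d, in);        /* d may itself open a comment */
            }
        } else {
            if (c == '"' || c == '\'')
                quote = c;                /* literal opened */
            putc(c, out);
        }
    }
}
```

Pushing the second character back with ungetc is what makes the first case work: the second slash is examined again and seen as the start of a comment. Tracking the escape inside the literal branch handles the second case, because the escaped quote no longer ends the string.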
Hello guys, I wrote this program, whose purpose is to open a file, count the characters on each line, and print the line with the most and the line with the fewest characters. I split it into two functions, one for the biggest line and one for the smallest. The "biggest line" function works just fine, but I get the wrong output for the smallest one. Here is the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <conio.h>
char f_view[150];
void ShowResults();
int leastsymbols();
int mostsymbols();
int main(){
ShowResults();
return 0;
}
int mostsymbols(){
FILE *fp;
fp=fopen(f_view, "r");
if(fp==NULL){
printf("Error\n");
exit(-1);
}
int lineNO=1;
int c;
int currCount=0;
int highestCount=0;
int highestline=0;
while ((c = getc(fp)) != EOF){
if (c == '\n') {
currCount=0;
lineNO++;
}
if (c != '\n' && c != '\t' && c!= ' ') {
currCount++;
if(currCount>highestCount){
highestCount=currCount;
if(lineNO>highestline){
highestline=lineNO;
}
}
}
}
fclose(fp);
return highestline;
}
int leastsymbols()
{
FILE *fp;
fp = fopen(f_view, "r");
if (fp == NULL)
{
printf("Could not open file \n");
exit(-1);
}
int c;
int lineNO = 1;
int currCount=0;
int leastLine=0;
int leastCount=1000;//assuming that a line in a file can not be longer
//than 1000 characters
while ((c = getc(fp)) != EOF){
if (c == '\n'){
currCount = 0;
lineNO++;
}
if (c != '\n' && c != '\t' && c!= ' ') {
currCount++;
}
if(currCount<leastCount){
leastCount=currCount;
leastLine=lineNO;
}
}
fclose(fp);
return leastLine;
}
void ShowResults()
{
FILE *fptr;
char *fix;
char c;
char openFile[1024];
printf("Type the destination to the *.c file or the file name.\n");
//the user has to enter a .C file
while(f_view[strlen(f_view) - 2] != '.' && f_view[strlen(f_view) - 1]
!= 'c')
{
fgets(f_view, 150, stdin);
fix = strchr(f_view, '\n');
if(fix != 0)
*fix = 0;
}
if((fptr = fopen(f_view, "r")) == NULL)
{
printf("Cannot open file !\n");
exit(-1);
}
int highestLine;
int lowestLine;
while (fgets(openFile, 1024, fptr))
{
highestLine=mostsymbols();
lowestLine=leastsymbols();
}
printf("Line %d has the most symbols.\n",highestLine);
printf("Line %d has the least symbols.\n",lowestLine);
fclose(fptr);
return ;
}
I fixed my program, thank you. :)
while ((c = getc(fp)) != EOF){
if(c == '\n' && currCount<leastCount){
leastCount=currCount;
leastLine=lineNO;
}
if(c=='\n'){
currCount = 0;
lineNO++;
}
if (c != '\n' && c != '\t' && c!= ' ') {
currCount++;
}
}
Move this check to the point where you advance to the next line:
if(currCount<leastCount){
leastCount=currCount;
leastLine=lineNO;
}
Your placement is wrong because the check runs after every character. Right after the first character of a line, currCount is still only 0 or 1 (depending on whether that character is counted), so it always compares as the smallest, and this happens for every new line you read.
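Putting the check at the end of each line could look like the sketch below. Two details go beyond the fix described above and are my own assumptions: INT_MAX replaces the guess of 1000 as the starting minimum, and a final line without a trailing newline is still compared. The function name least_symbols_line and the FILE * parameter are also illustrative.

```c
#include <limits.h>
#include <stdio.h>

/* Returns the 1-based number of the line with the fewest characters other
   than spaces and tabs, or 0 if the stream is empty. */
int least_symbols_line(FILE *fp) {
    int c;
    int lineNO = 1;
    int currCount = 0;
    int leastCount = INT_MAX;
    int leastLine = 0;
    int pending = 0;   /* nonzero once the current line has any character */

    while ((c = getc(fp)) != EOF) {
        if (c == '\n') {
            /* the line is complete: only now is currCount meaningful */
            if (currCount < leastCount) {
                leastCount = currCount;
                leastLine = lineNO;
            }
            currCount = 0;
            lineNO++;
            pending = 0;
        } else {
            pending = 1;
            if (c != '\t' && c != ' ')
                currCount++;
        }
    }
    /* a last line that does not end with a newline still counts */
    if (pending && currCount < leastCount)
        leastLine = lineNO;
    return leastLine;
}
```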
I am trying to read a file word by word, and below is the logic that I have adopted. It reads the words fine, except when it gets to the last word in a line: it then stores the last word of the current line AND the first word of the next line together. Could somebody tell me how I can get this to work?
int c;
int i =0;
char line[1000];
do{
c = fgetc(fp);
if( c != ' '){
printf("%c", c);
line[i++] = c;
}else if((c == '\n')){
//this is where It should do nothing
}else{
line[i] = '\0';
printf("\\0 reached\n");//meaning end of one word has been reached
strcpy(wordArr[counter++].word, line);//copy that word that's in line[xxx] to the struct's .word Char array
i=0;//reset the line's counter
}//if loop end
} while(c != EOF);//do-while end
fp is a file pointer.
HI BABY TYPE MAYBE
TODAY HELLO CAR
HELLO ZEBRA LION DON
TYPE BABY
I am getting (w/o quotes)
"HI"
"BABY"
"TYPE"
"MAYBE
TODAY"
Look at this:
if(c != ' ') {
// ...
} else if(c == '\n') {
// WILL NEVER BE REACHED
}
If c == '\n', then c != ' ' is also true, which means the second block will be skipped, and the first block will run for all '\n' characters, (i.e. they will be printed).
Other answers about line endings are wrong. C FILE *s not opened in binary mode will take care of EOL for you. If you have a file from DOS and you read it on Unix it might create problems, but I doubt that's your problem here, and if it was handling it could be a little more complicated than the answers here show. But you can cross that bridge when you reach it.
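One way to avoid the unreachable branch described above is to treat every whitespace character as a separator, using isspace from <ctype.h>. The sketch below is an assumption about how the loop could be restructured; next_word and its parameters are hypothetical names, and words longer than the buffer are truncated.

```c
#include <ctype.h>
#include <stdio.h>

/* Reads the next whitespace-delimited word from fp into word, which has
   room for cap bytes (cap must be at least 1). Returns 1 if a word was
   read, 0 at end of input. Overlong words are truncated to fit. */
int next_word(FILE *fp, char *word, size_t cap) {
    size_t i = 0;
    int c;   /* int so that EOF is detected before anything is stored */

    /* skip spaces, tabs, newlines and carriage returns before the word */
    while ((c = fgetc(fp)) != EOF && isspace(c))
        ;
    if (c == EOF)
        return 0;

    /* collect characters until the next separator or end of input */
    do {
        if (i + 1 < cap)
            word[i++] = (char)c;
    } while ((c = fgetc(fp)) != EOF && !isspace(c));

    word[i] = '\0';
    return 1;
}
```

Calling it in a loop such as while (next_word(fp, line, sizeof line)) yields "MAYBE" and "TODAY" as separate words, because the newline between them is consumed as a separator.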
The encoding of the line terminator differs from one operating system to another. In Linux, it is simply '\n', while in Windows and DOS it is the two-character sequence "\r\n". So, depending on your target OS, you may need to change your statement to something like:
if ((c == '\r') || (c == '\n'))
{
//...
}
EDIT: after looking closely, I think that what you're doing wrong is that the first if statement is true even when you read the \n, so you should handle it this way:
if((c != ' ') && (c != '\n')){
printf("%c", c);
line[i++] = c;
}
else if((c == '\n') || (c == '\r')){
//this is where It should do nothing
}
else{
//...
}
Try this:
if((c == '\n') || (c == '\r')){
Change
if( c != ' ')
to
if( c != ' ' && c != '\n')
This should fix the problem.
This works for me (on Linux):
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
main(int argc, char **argv)
{
int c; /* int, not char, so that EOF can be detected reliably */
size_t i = 0;
FILE *file = NULL;
char buffer[BUFSIZ];
int status = EXIT_SUCCESS;
if (argc < 2) {
fprintf(stderr, "%s <FILE>\n", argv[0]);
goto error;
}
file = fopen(argv[1], "r");
if (!file) {
fprintf(stderr, "%s: %s: %s\n", argv[0], argv[1],
strerror(errno));
goto error;
}
while (EOF != (c = fgetc(file))) {
if (BUFSIZ == i) {
fprintf(stderr, "%s: D'oh! Write a program that "
"doesn't use static buffers\n",
argv[0]);
goto error;
}
if (' ' == c || '\n' == c) {
buffer[i++] = '\0';
fprintf(stdout, "%s\n", buffer);
i = 0;
} else if ('\r' == c) {
/* ignore */
} else {
buffer[i++] = c;
}
}
exit:
if (file) {
fclose(file);
}
return status;
error:
status = EXIT_FAILURE;
goto exit;
}