Why is this program yielding wrong output - c

This program is supposed to remove all comments from a C source file. In this case, a comment is either a double slash '//', a newline character '\n' and anything in between them, or anything between '/*' and '*/'.
The program:
#include <stdio.h>
/* This is a multi line comment
testing */
int main() {
int c;
while ((c = getchar()) != EOF)
{
if (c == '/') //Possible comment
{
c = getchar();
if (c == '/') // Single line comment
while (c = getchar()) //While there is a character and is not EOF
if (c == '\n') //If a space character is found, end of comment reached, end loop
break;
else if (c == '*') //Multi line comment
{
while (c = getchar()) //While there is a character and it is not EOF
{
if (c == '*' && getchar() == '/') //If c equals '*' and the next character equals '/', end of comment reached, end loop
break;
}
}
else putchar('/'); putchar(c); //If not comment, print '/' and the character next to it
}
else putchar(c); //if not comment, print character
}
}
After I use this source code as its own input, this is the output I get:
#include <stdio.h>
* This is a multi line comment
testing *
int main() {
int c;
while ((c = getchar()) != EOF)
{
if (c == '') ////////////////
{
c = getchar();
if (c == '') ////////////////////
while (c = getchar()) /////////////////////////////////////////
if (c == '\n') ///////////////////////////////////////////////////////////////
break;
else if (c == '*') ///////////////////
{
while (c = getchar()) ////////////////////////////////////////////
{
No more beyond this point. I'm compiling it using g++ on the ubuntu terminal.
As you can see, multi-line comments had only their '/' characters removed, while single-line ones had all their characters replaced by '/'. Apart from that, any '/' characters that were NOT the beginning of a new comment were also removed, as in the line if (c == ''), which was supposed to be if (c == '/').
Does anybody know why? thanks.

C does not take notice of the way you indent your code. It only cares about its own grammar.
Look carefully at your elses and think about which if they attach to (hint: the closest open one).
There are other bugs, as well. EOF is not 0, so only the first while is correct. And what happens if the comment looks like this: /* something **/?
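To make the hint about else matching concrete, here is a minimal sketch. The function names `without_braces` and `with_braces` are made up for illustration and are not part of the original program. Both functions are indented the same way; only the braces change which `if` the `else` belongs to.

```c
/* Indented as if the else belonged to the outer if, but the compiler
   attaches it to the nearest unmatched if, which is the inner one. */
const char *without_braces(int outer, int inner)
{
    const char *result = "nothing";
    if (outer)
        if (inner)
            result = "inner";
    else
        result = "else";
    return result;
}

/* Braces close the inner if, so the else now belongs to the outer if. */
const char *with_braces(int outer, int inner)
{
    const char *result = "nothing";
    if (outer) {
        if (inner)
            result = "inner";
    } else {
        result = "else";
    }
    return result;
}
```

With `outer = 1` and `inner = 0`, the first function returns "else" and the second returns "nothing". Compiling with warnings enabled (for example `-Wall` in GCC) typically flags the first function.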

You have some (apparent) logic errors...
1.
while (c = getchar()) //While there is a character and is not EOF
You're assuming that EOF == 0. Why not be explicit and change the preceding line to:
while((c = getchar()) != EOF)
2.
else putchar('/'); putchar(c);
Are both of the putchars supposed to be part of the else clause? If so, you need braces {} around the two putchar statements. Also, give each putchar its own line; it not only looks nicer but it's more readable.
Conclusion
Other than what I've mentioned, your logic looks sound.

As already mentioned, the if/else matching is incorrect. One additional piece of missing functionality is that you must make it more stateful, to keep track of whether you are inside a string or not, e.g.
printf("This is not // a comment\n");
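One way to combine the points raised in these answers (explicit braces, a comment ending in `**/`, and string literals) is a small state machine. The following is only a sketch under some assumptions: it works on an in-memory string instead of `getchar()` so that it can be checked in isolation, the names `strip_comments` and `enum state` are invented here, and line splicing with a trailing backslash is not handled. In a `getchar()` loop, the test for the terminating `'\0'` would become a test for `EOF`.

```c
#include <stddef.h>

enum state { CODE, SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR,
             QUOTED, QUOTED_ESCAPE };

/* Copies src to dst with comments removed.
   dst must have room for strlen(src) + 1 bytes. */
void strip_comments(const char *src, char *dst)
{
    enum state st = CODE;
    char quote = '\0';   /* the quote character that opened the current literal */
    size_t out = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        switch (st) {
        case CODE:
            if (c == '/') {
                st = SLASH;              /* hold the '/' back until we know more */
            } else {
                if (c == '"' || c == '\'') {
                    quote = c;
                    st = QUOTED;
                }
                dst[out++] = c;
            }
            break;
        case SLASH:
            if (c == '/') {
                st = LINE_COMMENT;
            } else if (c == '*') {
                st = BLOCK_COMMENT;
            } else {
                dst[out++] = '/';        /* not a comment: emit the held '/' */
                st = CODE;
                i--;                     /* look at c again in the CODE state */
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') {
                dst[out++] = c;          /* the newline itself is kept */
                st = CODE;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*')
                st = BLOCK_STAR;
            break;
        case BLOCK_STAR:
            if (c == '/')
                st = CODE;
            else if (c != '*')
                st = BLOCK_COMMENT;      /* a run of '*' stays in BLOCK_STAR */
            break;
        case QUOTED:
            dst[out++] = c;
            if (c == '\\')
                st = QUOTED_ESCAPE;
            else if (c == quote)
                st = CODE;
            break;
        case QUOTED_ESCAPE:
            dst[out++] = c;              /* escaped character, e.g. \" */
            st = QUOTED;
            break;
        }
    }
    if (st == SLASH)
        dst[out++] = '/';                /* input ended right after a lone '/' */
    dst[out] = '\0';
}
```

Holding the '/' back in its own state avoids the extra `getchar()` call inside a condition, which is what loses a character when a comment ends in `**/`. The same idea is used for '*' inside a block comment.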

Related

C code doesn't print whole paragraph with newlines

This is my C code:
#include <stdio.h>
int main()
{
int c = getchar();
while (c != EOF) {
if (c != '\n')
putchar(c);
else putchar(32);
c = getchar();
}
return 0;
}
I want to make a program that prints out a paragraph with newlines, by replacing the \n character with spaces. The problem is, it only prints out the last line, when I use the code provided above.
For example, for the text:
This is
my
text
the result printed is text.
The paragraph is properly printed when I remove the if(), else conditions, and only leave the putchar(), without trying to replace anything.
What's the problem?
Your input file has CRLF newlines. You need to ignore the CR characters when you're replacing LF with space. Otherwise, printing the CR characters will go back to the beginning of the line and overwrite what was already printed.
#include <stdio.h>
int main()
{
int c;
while ((c = getchar()) != EOF) {
if (c == '\n') {
// replace newline with space
putchar(' ');
} else if (c == '\r') {
// ignore CR
} else {
putchar(c);
}
}
return 0;
}

Multiple blank lines are not squeezed in one blank line(C) using I/O redirection

I am asked to squeeze two or more consecutive blank lines in the input into one blank line in the output. I have to use Cygwin to do the I/O redirection and test it.
Example: ./Lab < test1.txt > test2.txt
my code is:
int main(void){
format();
printf("\n");
return 0;
}
void format(){
int c;
size_t nlines = 1;
size_t nspace = 0;
int spaceCheck = ' ';
while (( c= getchar()) != EOF ){
/*TABS*/
if(c == '\t'){
c = ' ';
}
/*SPACES*/
if (c ==' '){/*changed from isspace(c) to c==' ' because isspace is true for spaces/tabs/newlines*/
/* while (isspace(c = getchar())); it counts while there is space we will put one space only */
if(nspace > 0){
continue;
}
else{
putchar(c);
nspace++;
nlines = 0;
}
}
/*NEW LINE*/
else if(c == '\n'){
if(nlines >0){
continue;
}
else{
putchar(c);
nlines++;
nspace = 0;
}
}
else{
putchar(c);
nspace = 0;
nlines = 0;
}
}
}
However my test2.txt doesn't have the result I want. Is there something wrong in my logic/code?
You provide too little code, the interesting part would be the loop around the code you posted...
What you actually have to do there is skipping the output:
FILE* file = ...;
int c, prev = 0; /* int, not char, so that EOF can be told apart from real characters */
while((c = fgetc(file)) != EOF)
{
if(c != '\n' || prev != '\n')
putchar(c);
prev = c;
}
If we have an empty line following another one, then we encounter two subsequent newline characters, so both c and prev are equal to '\n'. That is the situation in which we do not want to output c (the subsequent newline). The inverse situation is that at least one of the two is unequal to '\n', as you see above, and only then do you want to output your character.
Side note on prev = 0: it needs to be initialised to anything other than a newline, so it could just as well have been 's'. Unless, of course, you want to skip an initial empty line, too; then you would have to initialise it with '\n'.
Edit, referring to your modified code: Edit2 (removed references to code as it changed again)
As your modified code shows that you do not only want to condense blank lines, but whitespace, too, you first have to consider that you have two classes of white space, on one hand, the newlines, on the other, any others. So you have to differentiate appropriately.
I recommend now using some kind of state machine:
#define OTH 0
#define WS 1
#define NL1 2
#define NL2 3
int state = OTH;
while (( c= getchar()) != EOF )
{
// first, the new lines:
if(c == '\n')
{
if(state != NL2)
{
putchar('\n');
state = state == NL1 ? NL2 : NL1;
}
}
// then, any other whitespace
else if(isspace(c))
{
if(state != WS)
{
putchar(' ');
state = WS;
}
}
// finally, all remaining characters
else
{
putchar(c);
state = OTH;
}
}
The first distinction is made on the current character's own class (newline, other whitespace, or anything else), the second on the previous character's class, which is what the current state records. A non-whitespace character is always output. A whitespace character is output only if the previous character was of a different class. Newlines are a special case that needs two states: as we want to leave one blank line, up to two subsequent newline characters have to be let through.
Be aware: whitespace-only lines do not count as blank lines in the above algorithm, so they won't be eliminated (but reduced to a line containing one single space). From the code you posted, I assume this is intended.
For completeness: This is a variant removing leading and trailing whitespace entirely and counting whitespace-only lines as empty lines:
if(c == '\n')
{
if(state != NL2)
{
putchar('\n');
state = state == NL1 ? NL2 : NL1;
}
}
else if(isspace(c))
{
if(state == OTH)
state = WS;
}
else
{
if(state == WS)
{
putchar(' ');
}
putchar(c);
state = OTH;
}
The trick: only enter the whitespace state if there was a non-whitespace character before, and do not print the space character until you encounter the next non-whitespace character.
Coming to the newlines - well, if there was a normal character, we are either in state OTH or WS, but none of the two NL states. If there was only whitespace on the line, the state is not modified, thus we remain in the corresponding NL state (1 or 2) and skip the line correspondingly...
To dissect this:
if(c == '\n') {
nlines++;
is nlines ever reset to zero?
if(nlines > 1){
c = '\n';
And what happens on the third \n in sequence? will nlines > 1 be true? Think about it!
}
}
putchar(c);
I don't get this: You unconditionally output your character anyways, defeating the whole purpose of checking whether it's a newline.
A correct solution would set a flag when c is a newline and not output anything. Then, when c is NOT a newline (else branch), output ONE newline if your flag is set and reset the flag. I leave the code writing to you now :)
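The answer above leaves the code as an exercise. A minimal sketch of the idea might look like this, with one assumption added: to keep a single blank line rather than none, the flag is widened to a counter of pending newlines, and at most two of them are written out. The name `squeeze_blank_lines` and the string-based interface are chosen only to keep the sketch easy to check; a `getchar()` loop would follow the same steps, with `EOF` in place of the terminating `'\0'`.

```c
#include <stddef.h>

/* Copies src to dst, letting through at most two consecutive newlines
   (one line end plus one blank line). dst needs strlen(src) + 1 bytes. */
void squeeze_blank_lines(const char *src, char *dst)
{
    size_t out = 0;
    int pending = 0;                 /* newlines seen but not yet written */

    for (size_t i = 0; ; i++) {
        char c = src[i];
        if (c == '\n') {
            pending++;               /* remember it, write nothing yet */
            continue;
        }
        /* Any other character, or the end of the input: flush what is pending. */
        for (int n = pending > 2 ? 2 : pending; n > 0; n--)
            dst[out++] = '\n';
        pending = 0;
        if (c == '\0')
            break;
        dst[out++] = c;
    }
    dst[out] = '\0';
}
```

Because nothing is written while newlines are being counted, a third or fourth newline in a row needs no special case; it only increases a counter that is capped when it is flushed.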

comments longer than one line

So I was wondering if this is right; I have to count the comments longer than one line in a file:
void commentsLongerThanOneLine(FILE* inputStream, FILE* outputStream) {
char c;
int i = 0;
while ((c = fgetc(inputStream) != EOF)) {
if (c == '/' && '*' && '\n') i++;
}
printf("Number of comments longer than one line is : %d\n", i);
return 0;
}
&& means boolean AND, and the test for equality is ==, not =.
You would need to test each indicator individually:
if ((c == '/') && (c == '*') && (c == '\n'))
Of course, this would always evaluate to false, as it is an impossible condition.
Even if the line worked the way you think, it would always be false, as you seem to test c against slash AND asterisk AND newline, which is impossible. You need to check for a slash, set a flag, and check whether the next character is an asterisk. Then verify that there is a newline before the end of the comment (*/).
// can span multiple lines and you need to check that the /* is not inside a // and vice versa.

C - Reading from a file issue with the last line

The expected input to my program for my assignment is something like
./program "hello" < helloworld.txt. The trouble with this however is that I must analyse every line that is in the program, so I have used the guard for the end of a line as:
while((c = getchar()) != EOF) {
if (c == '\n') {
/*stuff will be done*/
However, my problem with this is that if the helloworld.txt file contains:
hello
world
It will only read the first line(up to the second last line if there were to be more lines).
For this to be fixed, I have to strictly make a new line such that helloworld.txt looks something like:
hello
world
//
Is there another way around this?
Fix your algorithm. Instead of:
while((c = getchar()) != EOF) {
if (c == '\n') {
/* stuff will be done */
} else {
/* buffer the c character */
}
}
Do:
do {
c = getchar();
if (c == '\n' || c == EOF) {
/* do stuff with the buffered line */
/* clear the buffered line */
} else {
/* add the c character to the buffered line */
}
} while (c != EOF);
But please note that you shouldn't use the value of the c variable if it is EOF.
You need to re-structure your program so it can "do stuff" on EOF, if it has read any characters since the previous linefeed. That way, a non-terminated final line will still be processed.
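A minimal sketch of that restructuring, under stated assumptions: the original post does not say what is done with each line, so counting the lines that contain a given word stands in for it here. The names `count_lines_containing` and `MAX_LINE` are hypothetical, and lines longer than the buffer are simply truncated.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024   /* assumed upper bound on line length for this sketch */

/* Counts the lines of `in` that contain `word`, including a final line
   that is not terminated by '\n'. */
int count_lines_containing(FILE *in, const char *word)
{
    char line[MAX_LINE];
    size_t len = 0;
    int matches = 0;
    int c;

    do {
        c = getc(in);
        if (c == '\n' || c == EOF) {
            /* Process the buffered line, but skip the empty "line" that
               appears when EOF directly follows a newline. */
            if (c == '\n' || len > 0) {
                line[len] = '\0';
                if (strstr(line, word) != NULL)
                    matches++;
            }
            len = 0;                    /* clear the buffered line */
        } else if (len < MAX_LINE - 1) {
            line[len++] = (char)c;      /* overly long lines are truncated */
        }
    } while (c != EOF);

    return matches;
}
```

Called as `count_lines_containing(stdin, argv[1])`, this would treat "hello\nworld" and "hello\nworld\n" the same way: two lines in both cases. The value of `c` is only compared against `EOF`, never stored or printed, when the end of input is reached.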

reading a file using getc and skipping a line if it starts with semicolon

while((c = getc(file)) != -1)
{
if (c == ';')
{
//here I want to skip the line that starts with ;
//I don't want to read any more characters on this line
}
else
{
do
{
//Here I do my stuff
}while (c != -1 && c != '\n');//until end of file
}
}
Can I completely skip a line using getc if first character of line is a semicolon?
Your code contains a couple of references to -1. I suspect that you're assuming that EOF is -1. That's a common value, but it is simply required to be a negative value — any negative value that will fit in an int. Do not get into bad habits at the start of your career. Write EOF where you are checking for EOF (and don't write EOF where you are checking for -1).
int c;
while ((c = getc(file)) != EOF)
{
if (c == ';')
{
// Gobble the rest of the line, or up until EOF
while ((c = getc(file)) != EOF && c != '\n')
;
}
else
{
do
{
//Here I do my stuff
…
} while ((c = getc(file)) != EOF && c != '\n');
}
}
Note that getc() returns an int so c is declared as an int.
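Why the `int` matters can be shown with a short sketch. The two function names below are invented for illustration. The only assumption is the usual one that `int` is wider than `unsigned char`, so that every character value fits in an `int` as a non-negative number.

```c
#include <stdio.h>

/* Returns 1 if EOF is still recognisable after being stored in an
   unsigned char, 0 otherwise. */
int eof_survives_unsigned_char(void)
{
    unsigned char narrowed = (unsigned char)EOF;
    /* narrowed is promoted to a non-negative int, while EOF is negative,
       so this comparison cannot be true. */
    return narrowed == EOF;
}

/* Returns 1 if EOF is still recognisable after being stored in an int. */
int eof_survives_int(void)
{
    int wide = EOF;
    return wide == EOF;
}
```

With an `unsigned char` the loop condition `!= EOF` would never become false. With a signed `char` the opposite problem can occur: a legitimate byte such as 0xFF may compare equal to `EOF` and end the loop early.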
Let's assume that by "line" you mean a string of characters until you hit a designated end-of-line character (here assumed as \n, different systems use different characters or character sequences like \r\n). Then whether the current character c is in a semicolon-started line or not becomes a state information which you need to maintain across different iterations of the while-loop. For example:
bool is_new_line = true;
bool starts_with_semicolon = false;
int c;
while ((c = getc(file)) != EOF) {
if (is_new_line) {
starts_with_semicolon = c == ';';
}
if (!starts_with_semicolon) {
// Process the character.
}
// If c is '\n', then next letter starts a new line.
is_new_line = c == '\n';
}
The code is just to illustrate the principle -- it's not tested or anything.

Resources