C program to read specific lines from a file - c

I'm trying to write a program that counts the lines in a file and keeps a separate count of specific lines (i.e., lines that start with a # should not be counted).
while(fgets(tempstring,sizeof(tempstring),fptr)){
    lines++;
    if(tempstring[0] != '#' || tempstring[0]!='\n'|| tempstring[0]!=' '){
        ++count;
    }
}
Now what am I doing wrong here?
Also, I have noticed that the first time I call fgets I get ∩ as the output for tempstring[0]. Why is that?

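Your condition is always true: whatever tempstring[0] is, it differs from at least two of the three characters, so at least one of the != tests succeeds. You wanted either to use &&, or to negate the overall ||: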
if (tempstring[0] != '#' && tempstring[0]!='\n' && tempstring[0]!=' ')
or
if(!(tempstring[0] == '#' || tempstring[0] == '\n' || tempstring[0] == ' '))
which is equivalent. Note that you can remove the if altogether, because in C a logical expression evaluates to an int that is 1 when true and 0 when false:
count += (tempstring[0] != '#' && tempstring[0]!='\n' && tempstring[0]!=' ');
Also note that fgets may or may not give you the beginning of a line, depending on sizeof(tempstring). If tempstring is not long enough to hold a whole line from the file, the next call returns the continuation of that line, and your code treats it as the start of a new line, causing incorrect behavior. This is harder to fix, because now you need to check whether the last character of the string returned by fgets is '\n' and carry that information into the next iteration.
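A minimal sketch of that check, assuming a hypothetical helper named count_lines and a deliberately small buffer so that long lines are split across several fgets calls. A chunk is treated as the start of a line only if the previous chunk ended with '\n':

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the question): stores the number of lines
   in *total and the number of lines that do not start with '#', '\n' or ' '
   in *counted. */
void count_lines(FILE *fp, long *total, long *counted)
{
    char buf[16];            /* small on purpose; any size > 1 works */
    int at_line_start = 1;   /* the first chunk always starts a line */

    *total = 0;
    *counted = 0;
    while (fgets(buf, sizeof buf, fp)) {
        if (at_line_start) {
            ++*total;
            *counted += (buf[0] != '#' && buf[0] != '\n' && buf[0] != ' ');
        }
        /* The line is complete only if this chunk contains its '\n'. */
        at_line_start = (strchr(buf, '\n') != NULL);
    }
}
```

With this approach the buffer size no longer affects the result; it only changes how many fgets calls a long line takes.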

Related

Role of else if (state == OUT) in C word count

I'm learning C following the book "The C Programming Language" (K&R).
I found myself stuck in the understanding of the role of else if (state == OUT):
#define IN 1
#define OUT 0
main ()
{
int c, nw, state;
state = OUT;
nw = 0;
while ((c = getchar()) != EOF) {
if (c == ' ' || c == '\n' || c == '\t')
state = OUT;
else if (state == OUT) {
state = IN;
++nw;
}
}
printf ("%d", nw);
}
In the word counting program, I mean, in the way I read it there must be something I am doing wrong, because I fail to understand why this makes the difference, from simple else, since state = OUT is already default condition; but in practice I observe that it does, because if I write just else
then the statement state = IN; ++nw will count characters and not words;
From the way I read it, the loop is saying that for each input character (stored in the variable c), if it is a space, a newline, or a tab, then its value is zero, and everything else will be 1. So I fail to see how it is grouping characters into words, because state was OUT already before the loop. How is else if (state == OUT) getting the program to put the characters into one word?
I have been thinking whole night about it but I couldn't find an answer in my thoughts, nor in the book
To count words, we want to increment the count, with ++nw, only once per word.
If we write:
if (c == ' ' || c == '\n' || c == '\t')
state = OUT;
else {
state = IN;
++nw;
}
then ++nw will be executed every time c is not one of the white-space characters space, new-line, or tab. However, by writing:
if (c == ' ' || c == '\n' || c == '\t')
state = OUT;
else if (state == OUT) {
state = IN;
++nw;
}
then ++nw will not be executed when we are already in a word (state is IN). It will be executed only when we are out of a word (state is OUT) and are going into a new word (because c is not one of the white-space characters). Thus, ++nw is executed only when we start a new word, not for each character in the word.
cccc cc c c cccccc cc
^    ^  ^ ^ ^      ^
|    |  | | |      |
These are the two ways to count words - only the transitions count.
Geometrically this is easy and intuitive: every c that has a white space to its left.
But if you step (once, blindly) through each element, you only have to store the last character, which corresponds to the "left". This minimal memory makes this algorithm a state machine.
When you hit a letter, you only count it if you come from OUTside a word, but at the same time set the state to INside, so the next letter does not get counted. The next space then will set the trigger to "OUT".
This statement is checking character (c) values:
if (c == ' ' || c == '\n' || c == '\t')
and if any character (c) is a space or newline or tab then the state value is changed to OUT.
This statement else if (state == OUT) is checking the value of state whenever a character (c) is not a space, newline or tab.
The program assumes that words are separated by a space, newline or tab character and are at least one character long.
Punctuation characters such as :;,.?! etc. are included as legitimate characters for words.
getchar reads the input one character at a time.
The program finishes when getchar returns EOF, i.e., when there is no further input.
The word count is increased on the first character that is not space, newline or tab.
This program counts words by modeling a very simple state machine with two states:
OUT: We have not read a character that is part of a word
IN: We have read a character that is part of a word
The number of words is determined by the number of times we transition from the OUT to the IN state. Given an input like "This   is a     test", the transitions work like this:
This   is a     test
^+++v--^+v^v----^+++
where + represents the IN state, - represents OUT state, ^ represents the transition from OUT to IN, and v represents the transition from IN to OUT. The number of words is equal to the number of times we see ^.
In order to know whether or not we're making the transition from the OUT to the IN state, we have to check the current value of state when we see a non-whitespace character; hence, the
else if (state == OUT)
as the alternate branch instead of a plain else. Otherwise we'd increment nw for every non-whitespace character, rather than on the transition from OUT to IN.

How to read character until a char character?

I'm trying to do a loop that read from a file a single character until it finds '\n' or '*'.
This is the loop that I wrote:
i=0;
do {
fscanf(fin,"%c",&word[i]);
i++;
} while(word[i]!='*'&&word[i]!='\n');
Now I tried to see why it doesn't work with a debugger. When I print word[i] immediately after the fscanf it shows me the correct character but then, if I try to print word[i] while it's doing the comparison with '*' or '\n' it shows me that word[i] is '\000' and the loop never ends.
I also tried with fgetc but I have the same error.
You have to make sure that the character you are processing is the same you just read.
Actually you increment counter i before testing word [i], that's why your check fails.
Try instead
i=0;
do {
fscanf(fin,"%c",&word[i]);
}while(word[i]!='*'&&word[i++]!='\n');
I would rather move the check in the loop (break if the condition is satisfied) leaving in the while check the test on word array length.
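A rough sketch of that variant, assuming a hypothetical helper named read_until whose size parameter is the capacity of word. The loop condition guards the array length, and the terminator test happens inside the loop right after the character is stored:

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): reads characters from fin
   into word until a '*' or '\n' has been stored, EOF is reached, or the
   buffer is full. Returns the number of characters stored and always
   null-terminates word when size > 0. */
size_t read_until(FILE *fin, char *word, size_t size)
{
    size_t i = 0;

    if (size == 0)
        return 0;
    while (i < size - 1) {           /* leave room for the terminator */
        int c = fgetc(fin);          /* int, so EOF can be detected */
        if (c == EOF)
            break;
        word[i++] = (char)c;
        if (c == '*' || c == '\n')   /* test the character just read */
            break;
    }
    word[i] = '\0';
    return i;
}
```

Because the test uses c rather than word[i], it cannot accidentally look at an element that has not been written yet.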
Another way:
for(;;) {
    int c = fgetc(fin);
    if ( c == EOF ) {
        break;
    }
    word[i++] = c;   /* store the character, then advance */
    if( c == '*' || c == '\n' ) {
        break;
    }
}
Your while condition is not testing the same element of word that you just read, because i++ incremented the variable before the test.
Change the test to use word[i-1] instead of word[i] to adjust for this.
BTW, word[i] = fgetc(fin); is a simpler way to read one character.

Copying string: while loop breaks on the first condition only using C

I want to copy a string and stop copying when the next character is either '\0' or '.',
so I wrote
while((dest[i]=src[i])!='\0'||src[i]=='.');
i++;
When the character is '\0' the while loop works perfectly,
but not in the case of '.'.
Must I write a separate "if condition" for the second part, and why?
You have an infinite loop there.
while((dest[i]=src[i])!='\0'||src[i]=='.'); // This is the end of the loop,
// with an empty statement.
Also, you need to change the conditional a little bit.
(dest[i]=src[i]) != '\0' && src[i] != '.'
To avoid the empty statement problem after while and if statements, you can change your coding standard so that you always use the {}.
while ( (dest[i]=src[i]) != '\0' && src[i] != '.' )
{
    ++i;
}
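For comparison, here is a small sketch of a complete copy routine, assuming a hypothetical function named copy_until_dot. Unlike the loop above, which also stores the '.' in dest, this version stops before the '.' and always null-terminates dest; that behavior is an assumption about what is wanted, not something the question states:

```c
#include <stddef.h>

/* Hypothetical helper (not from the question): copies src into dest up to,
   but not including, the first '.' or the terminating '\0'. dest must have
   room for the copied characters plus a terminator. Returns the number of
   characters copied. */
size_t copy_until_dot(char *dest, const char *src)
{
    size_t i = 0;

    while (src[i] != '\0' && src[i] != '.') {
        dest[i] = src[i];
        ++i;
    }
    dest[i] = '\0';   /* dest is a valid string in both cases */
    return i;
}
```

Separating the test from the assignment makes it easier to see that both stop conditions are joined with &&.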

How to check and validate user input is one of two valid choices

I have the following code asking for user input either A or P. I have the same sort of setup for hour and minutes, where hour would be between 1 and 12 and minutes would be between 0 and 59. That part of my code is thoroughly working.
My issue is that I don't know how to check what the timePeriod variable is and ensure that it is either A or P and to print an error message and prompt again if it is anything else including lowercase a and p. User input has to be in uppercase and ONLY A or P.
I've only put the function code here. I added the clean_stdin code as well so the while statement inside getTimePeriod might be easier to understand. As I said before, I'm using a similar set up for both the hour and minutes and that's working.
char getTimePeriod(void)
{
char timePeriod, term;
while ( (((scanf("%c%c",&timePeriod,&term)!=2 || term !='\n') && clean_stdin()) || timePeriod != "A" || timePeriod != "P") && printf("Invalid value entered. Must be A or P. Please re-enter: ") );
return timePeriod;
}
int clean_stdin()
{
while (getchar()!='\n');
return 1;
}
Edit: For those getting their panties in a twist about this being bad code, it works for me based on my assignment requirements for an Intro to C course. Hope that clarifies the noob-ness of this question as well.
Also note that
timePeriod != 'A'
does not work. I don't know why but it doesn't work.
Recommend separating user input from validation.
scanf() tries to do both at once. It is easier to handle potentially wrong user input if a line of input is simply read (with fgets(), which is standard, or getline(), which is common, as @Jonathan Leffler notes) and then parsed by various means.
#include <stdio.h>
#include <string.h>

// return choice or EOF
int GetChoice(const char *prompt, const char *reprompt, const char *choices) {
    char buf[10];
    puts(prompt);
    while (fgets(buf, sizeof buf, stdin)) {
        buf[strcspn(buf, "\n")] = 0; // drop potential trailing \n
        // Reject an empty line first: strchr() would otherwise match the
        // terminating '\0' of choices.
        char *p = buf[0] ? strchr(choices, buf[0]) : NULL;
        if (p && buf[1] == '\0') {
            // Could fold upper/lower case here if desired.
            return *p;
        }
        puts(reprompt);
    }
    return EOF;
}
int timePeriod = GetChoice("TimePeriod A or P", "Try Again", "AP");
switch (timePeriod) {
    case 'A' : ...
    case 'P' : ...
    default: ...
}
Additional checks could be added. That is the best part about rolling this off to a helper function: it can be used in multiple places in the code and be improved as needed in a localized manner.
OP code comments:
If user input is not as expected, it is unclear that OP's complex while() condition will properly empty the user's line of input. It certainly has trouble if EOF is encountered or if the first char is a '\n'.
timePeriod != "A", as commented by @Alan Au, is not the needed code. That compares timePeriod to the address of the string "A". Use timePeriod != 'A'.
clean_stdin() should be clean_stdin(void). It is an infinite loop on EOF. Consider:
int ch;
while ((ch = getchar()) != '\n' && ch != EOF);
The problem you're having is here:
timePeriod != "A" || timePeriod != "P"
First, as was mentioned before, you can't compare a character to a string. You need to use single quotes instead of double quotes. Assuming that's been corrected, this conditional will always be true since timePeriod will always either not be 'A' or not be 'P'. This needs to be a logical AND:
(timePeriod != 'A' && timePeriod != 'P')
Note also that an extra set of parentheses was added to make sure that the order of operations in your while expression is preserved.
Regarding the "cute" comment, what that means is that cramming a bunch of statements in the while expression and leaving the body blank makes your code difficult to read and therefore more prone to bugs. Had you broken that up into multiple statements each doing one logical thing you would have been more likely to find this bug yourself. Olaf made the comment he did primarily to warn other users who come across your code about just that.
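As an illustration of breaking the expression up, here is a sketch under a few assumptions: input is read one line at a time with fgets, the validation lives in a hypothetical helper named is_valid_period, and getTimePeriod returns int instead of char so it can report EOF. None of these choices come from the original assignment.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 only if line is exactly "A" or "P". */
int is_valid_period(const char *line)
{
    return (line[0] == 'A' || line[0] == 'P') && line[1] == '\0';
}

/* Returns 'A' or 'P', or EOF if the input ends before a valid entry. */
int getTimePeriod(void)
{
    char buf[16];

    while (fgets(buf, sizeof buf, stdin)) {
        size_t len = strlen(buf);

        /* A full buffer without '\n' means the line was too long:
           discard the remainder and treat the entry as invalid. */
        if (len == sizeof buf - 1 && buf[len - 1] != '\n') {
            int ch;
            while ((ch = getchar()) != '\n' && ch != EOF)
                ;
            buf[0] = '\0';
        }
        buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */

        if (is_valid_period(buf))
            return buf[0];
        printf("Invalid value entered. Must be A or P. Please re-enter: ");
    }
    return EOF;
}
```

Each statement does one thing, so the && in the validity test is easy to check on its own.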

Word count in C?

I have a problem with counting words read from standard input. I use the same method to count words in files, and there it works OK.
My method is as follows: We read until ctrl+d. If the next character is a line return, increase new_lines. Otherwise, we increase the words because the next method (last if) doesn't read until first space and I lost first word. In the end If the current character is a space and next element is something other than a space, increase words.
Now I'm going to explain about problem. If I have empty line program increase words but why I use second if for this. If I don't have empty lines program work.
int status_read=1;
while (status_read > 0){ // read to end of file
status_read = read(STDOUT_FILENO, buff, 9999); // read from std
for (i = 0; i < status_read ; i++) { // until i<status_read
if (buff[i] == '\n') {
new_lines++;
if (buff[i+1]!='\n')
wordcounter++;
}
if (buff[i] == ' ' && buff[i+1]!=' ')
wordcounter++;
}
}
As @FredLarson commented, you are trying to read from standard out, not standard in (that is, you should be using STDIN_FILENO, not STDOUT_FILENO).
If I have empty line program increase words but why I use second if
for this. If I don't have empty lines program work.
That's due to
if (buff[i] == '\n') {
new_lines++;
if (buff[i+1]!='\n')
wordcounter++;
}
- to solve this problem, just don't increment wordcounter here - replace the above with
if (buff[i] == '\n') ++new_lines;
Otherwise,
we increase the words because the next method (last if) doesn't read
until first space and I lost first word.
To avoid the problem of losing the first word on a line, as well as that with buff[i+1] (see M Oehm's comments above), I suggest changing
if (buff[i] == ' ' && buff[i+1]!=' ')
wordcounter++;
to
if (wasspace && !isspace(buff[i])) ++wordcounter;
wasspace = isspace(buff[i]);
- wasspace being defined and initialized to int wasspace = 1; before the file read loop.
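Putting the two suggestions together, a sketch might look like this, assuming a hypothetical helper named count_chunk that processes one buffer at a time. The wasspace flag is passed in by pointer so that a word split across two read calls is counted only once:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not from the question): updates *words and
   *new_lines for the n bytes in buf. *wasspace carries the state between
   calls and should be initialized to 1 before the first chunk. */
void count_chunk(const char *buf, size_t n, int *wasspace,
                 long *words, long *new_lines)
{
    for (size_t i = 0; i < n; i++) {
        /* isspace() requires a value representable as unsigned char */
        unsigned char ch = (unsigned char)buf[i];

        if (ch == '\n')
            ++*new_lines;
        if (*wasspace && !isspace(ch))   /* first character of a word */
            ++*words;
        *wasspace = isspace(ch) != 0;
    }
}
```

Inside the read loop this would be called as count_chunk(buff, (size_t)status_read, &wasspace, &wordcounter, &new_lines), assuming wordcounter and new_lines are declared as long. Since only buff[i] is examined, the code never looks at buff[i+1] beyond the data that was actually read.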
