Clarification in getop() - c

In Dennis Ritchie's "C programming Language" book,
In getop func,
he states that s[1]='\0'
why does he end the array on index 1? What's the significance and need?
In later part he does uses other parts of the array..
int getch(void);
void ungetch(int);
/* getop: get next character or numeric operand */
int getop(char s[])
{
int i, c;
while ((s[0] = c = getch()) == ' ' || c == '\t')
;
s[1] = '\0';
if (!isdigit(c) && c != '.')
return c; /* not a number */
i = 0;
if (isdigit(c)) /* collect integer part */
while (isdigit(s[++i] = c = getch()))
;
if (c == '.') /* collect fraction part */
while (isdigit(s[++i] = c = getch()))
;
s[i] = '\0';
if (c != EOF)
ungetch(c);
return NUMBER;
}

Because the function might return before the remaining input is read, and then s needs to be a complete (and terminated) string.

The zero char will be overwritten later, unless the test if (!isdigit(c) && c != '.') is true so that the function returns early. Some coding guidelines would discourage returns from the middle of a function, btw.
Only when the function returns early is the s[1] = '\0' relevant. Therefore, one could have coded
int getop(char s[])
{
int i, c;
while ((s[0] = c = getch()) == ' ' || c == '\t')
;
if (!isdigit(c) && c != '.')
{
s[1] = '\0'; /* Do this only when we return here */
return c; /* not a number */
}
i = 0;
if (isdigit(c)) /* collect integer part */
/* ... */
It is terse code. These guys knew their language and algorithms. But the snippet lacks error handling and thus depends on correct input (invalid input may let s be empty or let s overflow).
That is not untypical. Processing of valid data is often straightforward and short; handling all the contingencies makes the code convoluted and baroque.

s[] might be only a numeric operand like "+","-" etc.
In this case s[1] have to be '\0' to let s be a proper string.

Related

How does the function getop() from "The C Programming Book" work?

int getop(char *s) {
int i, c;
while((s[0] = c = getch()) == ' ' || c == '\t')
;
s[1] = '\0';
if(!isdigit(c) && c != '.') {
return c;
}
i = 0;
if(isdigit(c))
while(isdigit(s[++i] = c = getch()))
;
if(c == '.')
while(isdigit(s[++i] = c = getch()))
;
s[i] = '\0';
if(!isdigit(c))
ungetch(c);
return NUMBER;
}
I came across this fucntion while working on an example named "Reverse polish calculator".
we can input numbers for calculator operations via this function, but I'm not getting the working of this function. Like.,
if we enter some input like ---->
12.34 11.34 +
From
while((s[0] = c = getch()) == ' ' || c == '\t')
;
s[1] = '\0';
s will contain 1. But from where does the remaining input goes inside s ?
I've gone through this function well and I came to know it's working but then I want to know the deep working, like the complete flow.
Any help is highly appriciated.
edit:-
After testing various inputs I came to the conclusion that what is the need for getch() and ungetch(), I mean yeah they are there to unread character that is not needed but than look at the test cases.,
int getop(char *s) {
int i, c;
while((s[0] = c = getchar()) == ' ' || c == '\t')
;
s[1] = '\0';
if(!isdigit(c) && c != '.') {
return c;
}
i = 0;
if(isdigit(c))
while(isdigit(s[++i] = c = getchar()))
;
if(c == '.')
while(isdigit(s[++i] = c = getchar()))
;
s[i] = '\0';
/* if(!isdigit(c))
ungetch(c);*/
return NUMBER;
}
Here I replaced getch() with getchar() and it still accepted the input
12a 12 -
and the output was absolutely correct and it unread 'a' character as well
144
and so was the case when I was using getch() ?
Lets break it down piece by piece:
while((s[0] = c = getch()) == ' ' || c == '\t') skips all spaces and tabulation from the beginning fo the string. At the end of this loop s[0] contains the first char which is not either of the two mentioned before.
Now if c is something other than . or a digit we simply return that.
if(!isdigit(c) && c != '.') {
return c;
}
If c is not a digit we are done and it's quite right to set s[1] to null!
Otherwise, if c is a digit we read as much digit chars as we can overwriting s from position 1. s[1] was '\0' but it does not matter because s[++i] would be s[1] at the very first iteration of
i = 0;
if(isdigit(c))
while(isdigit(s[++i] = c = getch()))
so s is kept in a consistent status.

query regarding getop() in K&R C

In the function below:
Why it is terminating the string initially by s[1] = '\0';?
after i = 0, why starting to take values from s[1] not from s[0]?
#define NUMBER '0'
#define MAXSIZE 100
char s[MAXSIZE];
/* getop: get next character or numeric operand */
int getop(char s[ ])
{
int i, c;
while ((s[0] = c = getch()) == ' ' || c == '\t')
;
s[1] = '\0';
if (!isdigit(c) && c != '.')
return c; /* not a number */
i = 0;
if (isdigit(c)) /* collect integer part */
while (isdigit(s[++i] = c = getch()))
;
if (c == '.') /* collect fraction part */
while (isdigit(s[++i] = c = getch()))
;
s[i] = '\0';
if (c != EOF)
ungetch(c);
return NUMBER;
}
The function appears to store its result in a pointer to a string, in C style zero terminated. The first getch line stores its result in s[0], and if it's not a digit or the period, it immediately returns. Storing a zero as the 2nd character makes sure the returned string is properly ended -- it contains only one character.
After that initial step you already have one valid character, and it's stored in s[0]. So, all next getch calls need to store from 1 onwards, or it would overwrite the first character entered.

How does `ungetch` work?

I searched everywhere, but I cannot find the answer to this! I'm working on the exercise from the K&R C books with a function they call getop. When it peeks at the next character from the input and sees that it isn't a digit, where does the character get stored when unget is called? I can compile and run the code so I know it works, I just want to know where the character has been stored.
int getop(char s[])
{
int i, c;
while ((s[0] = c = getch()) == ' ' || c == '\t')
;
s[1] = '\0';
if (!isdigit(c) && c != '.')
return c; /* not a number */
i = 0;
if (isdigit(c)) /*collect integer part*/
while (isdigit(s[++i] = c = getch()))
;
if (c == '.') /*collect fraction part*/
while (isdigit(s[++i] = c = getch()))
;
s[i] = '\0';
if (c != EOF)
ungetch(c);
return NUMBER;
}
It doesn't really peek, it reads the char into the stream's buffer. On an unbuffered stream, or when you've read nothing after opening the file or seeking, ungetc() isn't guaranteed to work.

Confusion about calculator program

int getop(char s[])
{
int i = 0, c, next;
/* Skip whitespace */
while((s[0] = c = getch()) == ' ' || c == '\t')
;
s[1] = '\0';
/* Not a number but may contain a unary minus. */
if(!isdigit(c) && enter code herec != '.' && c != '-')
return c;
if(c == '-')
{
next = getch();
if(!isdigit(next) && next != '.')
return c;
c = next;
}
else
c = getch();
while(isdigit(s[++i] = c)) //HERE
c = getch();
if(c == '.') /* Collect fraction part. */
while(isdigit(s[++i] = c = getch()))
;
s[i] = '\0';
if(c != EOF)
ungetch(c);
return NUMBER;
};
what if there is no blank space or tab than what value will s[0] will initialize .......& what is the use of s[1]='\0'
what if there is no blank space or tab than what value will s[0] will intialize
The following loop will continue executing until getch() returns a character that's neither a space nor a tab:
while((s[0] = c = getch()) == ' ' || c == '\t')
;
what is the use of s[1]='\0'
It converts s into a C string of length 1, the only character of which has been read by getch(). The '\0' is the required NUL-terminator.
If there is no space or tab, you're stuck with an infinite loop.
s[1]='\0' is a way of marking the end so functions like strlen()
know when to stop reading through c strings. it's called "Null-Terminating" a string: http://chortle.ccsu.edu/assemblytutorial/Chapter-20/ass20_2.html
while((s[0] = c = getch()) == ' ' || c == '\t')
Read character until its not a tab or space.
s[1] = '\0';
Convert char array s to a proper string in C format (all strings in c must be terminated with a null byte, which is represented by '\0'.

Newbie C Array question over functions

I bought a C book called "The C (ANSI C) PROGRAMMING LANGUAGE" to try and teach myself, well C. Anyhow, the book includes a lot of examples and practices to follow across the chapters, which is nice.
Anyhow, the code below is my answer to the books "count the longest line type of program", the authors are using a for-loop in the function getLine(char s[], int lim). Which allows for a proper display of the string line inside the main() function. However using while won't work - for a reason that is for me unknown, perhaps someone might shed a light on the situation to what my error is.
EDIT: To summarize the above. printf("%s\n", line); won't display anything.
Thankful for any help.
#include <stdio.h>
#define MAXLINE 1024
getLine(char s[], int lim) {
int c, i = 0;
while((c = getchar()) != EOF && c != '\n' && i < lim) {
s[++i] = c;
}
if(c == '\n' && i != 0) {
s[++i] = c;
s[++i] = '\0';
}
return i;
}
main(void) {
int max = 0, len;
char line[MAXLINE], longest[MAXLINE];
while((len = getLine(line,MAXLINE)) > 0) {
if(len > max) {
max = len;
printf("%s\n", line);
}
}
return 0;
}
You have a number of serious bugs. Here's the ones I found and how to fix them.
change your code to postincrement i to avoid leaving the first array member uninitialised, and to avoid double printing the final character:
s[++i] = c;
...
s[++i] = c;
s[++i] = '\0';
to
s[i++] = c;
...
// s[++i] = c; see below
...
s[i++] = '\0';
and fix your EOF bug:
if(c == '\n' && i != 0) {
s[++i] = c;
s[++i] = '\0';
}
to
if(c == '\n')
{
s[i++] = '\n';
}
s[i] = '\0'
Theory
When writing programs that deal with strings, arrays or other vector-type structures it is vitally important that you check the logic of your program. You should do this by hand, and run a few sample cases through it, providing sample inputs to your program and thinking out what happens.
The cases you need to run through it are:
a couple general cases
all the edge cases
In this case, your edge cases are:
first character ever is EOF
first character is 'x', second character ever is EOF
first character is '\n', second character is EOF
first character is 'x', second character is '\n', third character is EOF
a line has equal to lim characters
a line has one less than lim characters
a line has one more than lim characters
Sample edge case
first character is 'x', second character is '\n', third character is EOF
getLine(line[MAXLINE],MAXLINE])
(s := line[MAXLINE] = '!!!!!!!!!!!!!!!!!!!!!!!!...'
c := undef, i := 0
while(...)
c := 'x'
i := 1
s[1] := 'x' => s == '!x!!!!...' <- first bug found
while(...)
c := '\n'
end of while(...)
if (...)
(c== '\n' (T) && i != 0 (T)) = T
i := i + 1 = 2
s[2] = '\n' => s == '!x\n!!!!'
i := i + 1 = 3
s[3] = '\0' => s == '!x\n\0!!!' <- good, it's terminated
return i = 3
(len = i = 3) > 0) = T (the while eval)
if (...)
len (i = 3) > max = F
max = 3 <- does this make sense? is '!x\n' a line 3 chars long? perhaps. why did we keep the '\n' character? this is likely to be another bug.
printf("%s\n", line) <- oh, we are adding ANOTHER \n character? it was definitely a bug.
outputs "!x\n\n\0" <- oh, I expected it to print "x\n". we know why it didn't.
while(...)
getLine(...)
(s := line[MAXLINE] = '!x\n\0!!!!!!!!!!!!!!!!!!!...' ; <- oh, that's fun.
c := undef, i := 0
while(...)
c := EOF
while terminates without executing body
(c == '\n' && i != 0) = F
if body not executed
return i = 0
(len = i = 0 > 0) = F
while terminates
program stops.
So you see this simple process, that can be done in your head or (preferably) on paper, can show you in a matter of minutes whether your program will work or not.
By following through the other edge cases and a couple general cases you will discover the other problems in your program.
It's not clear from your question exactly what problem you're having with getLine (compile error? runtime error?), but there are a couple of bugs in your implementation. Instead of
s[++i] = something;
You should be using the postfix operator:
s[i++] = something;
The difference is that the first version stores 'something' at the index of (i+1), but the second version will store something at the index of i. In C/C++, arrays are indexed from 0, so you need to make sure it stores the character in s[0] on the first pass through your while loop, in s[1] on the second pass through, and so on. With the code you posted, s[0] is never assigned to, which will cause the printf() to print out unintialised data.
The following implementation of getline works for me:
int getLine(char s[], int lim) {
int c;
int i;
i = 0;
while((c = getchar()) != EOF && c != '\n' && i < lim) {
s[i++] = c;
}
if(c == '\n' && i != 0) {
s[i++] = c;
s[i++] = '\0';
}
return i;
}
By doing ++i instead of i++, you are not assigning anything to s[0] in getLine()!
Also, you are unnecesarilly incrementing when assigning '\0' at the end of the loop, which BTW you should always assign, so take it out from the conditional.
Also add return types to the functions (int main and int getLine)
Watch out for the overflow as well - you are assigning to s[i] at the end with a limit of i == lim thus you may be assigning to s[MAXLINE]. This would be a - wait for it - stack overflow, yup.

Resources