Understanding K&R's getint() (Chapter 5: Pointers & Arrays, Exercise 1)? - c

I'm a novice programmer who's self-studying C through K&R. I don't understand the design of their function getint(), which converts a string of digits into the integer it represents. I'll ask my question then post the code below.
If getch() returns a non-digit character that's not a '-' or '+', it pushes this non-digit character back onto the input with ungetch(), and returns 0. So, if getint() is called again, getch() will just return that same non-digit character that was pushed back, so ungetch() will push it back again, etc. The way I understand it (which could be wrong), the function breaks completely if it's passed any non-digit character.
The exercise doesn't have you fix this. It asks to fix the fact that a '-' or '+' followed by a non-digit is a valid representation of 0.
What exactly am I missing here? Did they design getint() to make an infinite loop if the input is anything other than 0-9? Why?
Here's their code for getint() [edit] with main calling getint():
#include <stdio.h>
#include <ctype.h>
#define BUFSIZE 100
int getint(int *);
int main()
{
    int n, array[BUFSIZE];
    for (n = 0; n < BUFSIZE && getint(&array[n]) != EOF; n++)
        ;
    return 0;
}
int getch(void);
void ungetch(int);
int getint(int *pn)
{
    int c, sign;
    while (isspace(c = getch()))
        ;
    if (!isdigit(c) && c != EOF && c != '+' && c != '-') {
        ungetch(c); //this is what i don't understand
        return 0;
    }
    sign = (c == '-') ? -1 : 1;
    if (c == '-' || c == '+')
        c = getch();
    for (*pn = 0; isdigit(c); c = getch())
        *pn = 10 * *pn + (c - '0');
    *pn *= sign;
    if (c != EOF)
        ungetch(c);
    return c;
}
int buf[BUFSIZE];
int bufp = 0;
int getch(void)
{
    return (bufp > 0) ? buf[--bufp] : getchar();
}
void ungetch(int c)
{
    if (bufp >= BUFSIZE)
        printf("ungetch: can't push character\n");
    else
        buf[bufp++] = c;
}

As currently written, getint() tries to read an integer from the input and store it in *pn.
If the user enters a positive or negative number (with or without a sign), *pn is set to that number and getint() returns a positive value (the character that follows the number).
If the user enters something that is not a valid number, *pn is not updated and getint() returns 0 (meaning it failed).
"the function breaks completely if it's passed any non-digit character."
That's right. All subsequent calls to getint() will fail, because the offending character was passed to ungetch() and is read again. What you understand is correct.
But this is how getint() is supposed to handle garbage input. It simply rejects it and returns 0 (meaning it failed). It is not the responsibility of getint() to deal with non-integer input and prepare fresh input for the next read; that is the caller's job. It is not a bug.
The only bug is that a '-' or '+' followed by a non-digit is currently treated as a valid representation of 0, which is left to the reader as an exercise.
If the input is at EOF, *pn is set to 0 (and multiplied by 1) and getint() returns EOF.

The reasoning is similar to scanf not consuming characters that don't match the conversion specifier - you don't want to consume something that isn't part of a valid integer, but may be part of a valid string or other type of input. getint has no way of knowing whether the input it rejects is part of an otherwise valid non-numeric input, so it has to leave the input stream the same way it found it.
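The division of labor described in these answers can be sketched with a string-based analogue of getint(). This is only an illustration under stated assumptions, not K&R's code: getint_str and read_all_ints are hypothetical names, a char pointer stands in for the input stream, and leaving that pointer untouched stands in for ungetch(). The sketch also rejects a lone '+' or '-', which is what the exercise asks for, and it does not check for overflow.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>   /* for EOF */

/* Hypothetical string-based analogue of getint(): *sp is the read cursor.
 * Returns EOF at end of string, 0 if the next token is not a number
 * (cursor left on the offending character), and 1 on success. */
int getint_str(const char **sp, int *pn)
{
    const char *s = *sp;
    const char *start;
    int sign = 1;

    while (isspace((unsigned char)*s))
        s++;
    if (*s == '\0') {
        *sp = s;
        return EOF;
    }
    start = s;                      /* first non-space character */
    if (*s == '+' || *s == '-') {
        sign = (*s == '-') ? -1 : 1;
        s++;
    }
    if (!isdigit((unsigned char)*s)) {
        *sp = start;                /* "push back": consume nothing */
        return 0;
    }
    for (*pn = 0; isdigit((unsigned char)*s); s++)
        *pn = 10 * *pn + (*s - '0');
    *pn *= sign;
    *sp = s;
    return 1;
}

/* The caller decides what to do with rejected input: here it skips
 * one character and tries again. */
size_t read_all_ints(const char *s, int *out, size_t max)
{
    size_t n = 0;
    int r;

    while (n < max && (r = getint_str(&s, &out[n])) != EOF) {
        if (r == 0)
            s++;                    /* consume the character that was rejected */
        else
            n++;
    }
    return n;
}
```

Without the `s++` in read_all_ints, the loop would see the same rejected character forever, which is exactly the behavior the question describes for the main() that calls getint().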

Related

Function of (char)getchar() in C programming

My friend asked me what (char)getchar() is, which he found in some online code. I googled it and found no results of it being used; I thought the regular usage is just ch = getchar(). This is the code he found. Can anyone explain what it does?
else if (input == 2)
{
if (notes_counter != LIST_SIZE)
{
printf("Enter header: ");
getchar();
char c = (char)getchar();
int tmp_count = 0;
while (c != '\n' && tmp_count < HEADER)
{
note[notes_counter].header[tmp_count++] = c;
c = (char)getchar();
}
note[notes_counter].header[tmp_count] = '\0';
printf("Enter content: ");
c = (char)getchar();
tmp_count = 0;
while (c != '\n' && tmp_count < CONTENT)
{
note[notes_counter].content[tmp_count++] = c;
c = (char)getchar();
}
note[notes_counter].content[tmp_count] = '\0';
printf("\n");
notes_counter++;
}
}
(char)getchar() is a mistake. Never use it.
getchar returns an int that is either an unsigned char value of a character that was read or is the value of EOF, which is negative. If you convert it to char, you lose the distinction between EOF and some character that maps to the same char value.
The result of getchar should always be assigned to an int object, not a char object, so that these values are preserved, and the result should be tested to see if it is EOF before the program assumes a character has been read. Since the program uses c to store the result of getchar, c should be declared as int c, not char c.
It is possible a compiler issued a warning for c = getchar(); because that assignment implicitly converts an int to a char, which can lose information as mentioned above. (This warning is not always issued by a compiler; it may depend on warning switches used.) The correct solution for that warning is to change c to an int, not to insert a cast to char.
About the conversion: The C standard allows char to be either signed or unsigned. If it is unsigned, then (char) getchar() will convert an EOF returned by getchar() to some non-negative value, which will be the same value as one of the character values. If it is signed, then (char) getchar() will convert some of the unsigned char character values to char in an implementation-defined way, and some of those conversions may produce the same value as EOF.
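The collision described above can be observed with a small helper. This is a sketch, not part of the original answer: char_cast_equals_eof is a hypothetical name, and the result for 0xFF assumes 8-bit bytes, EOF equal to -1, and the common wrap-around conversion to a signed char (the standard leaves that conversion implementation-defined).

```c
#include <limits.h>
#include <stdio.h>

/* Returns nonzero if ch, once narrowed to char, compares equal to EOF. */
int char_cast_equals_eof(int ch)
{
    char c = (char)ch;   /* the same narrowing that (char)getchar() performs */
    return c == EOF;
}
```

On a platform where char is signed, char_cast_equals_eof(0xFF) is typically nonzero, so a valid byte 255 looks like end of file. Where char is unsigned, char_cast_equals_eof(EOF) is zero, so a loop testing the char against EOF never terminates.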
The code is a typical example of incorrect usage of the getchar() function.
getchar(), and more generally getc(fp) and fgetc(fp) return a byte from the stream as a positive value between 0 and UCHAR_MAX or the special negative value EOF upon error or end of file.
Storing this value into a variable of type char loses information. It makes testing for EOF unreliable if type char is signed: if EOF has the value (-1), it cannot be distinguished from a valid byte value 255, which most likely gets converted to -1 when stored to a char variable on CPUs with 8-bit bytes. It makes the test impossible on architectures where type char is unsigned by default, on which all char values are different from EOF.
In this program, the variables receiving the getchar() return value should have type int.
Note also that EOF is not tested in the code fragment, causing invalid input strings such as long sequences of ÿÿÿÿÿÿÿÿ at end of file.
Here is a modified version:
else if (input == 2)
{
    if (notes_counter != LIST_SIZE)
    {
        int c;
        // consume the rest of the input line left pending by `scanf()`
        // this should be performed earlier in the function
        while ((c = getchar()) != EOF && c != '\n')
            continue;
        printf("Enter header: ");
        int tmp_count = 0;
        while ((c = getchar()) != EOF && c != '\n') {
            if (tmp_count + 1 < HEADER)
                note[notes_counter].header[tmp_count++] = c;
        }
        note[notes_counter].header[tmp_count] = '\0';
        printf("Enter content: ");
        tmp_count = 0;
        while ((c = getchar()) != EOF && c != '\n') {
            if (tmp_count + 1 < CONTENT)
                note[notes_counter].content[tmp_count++] = c;
        }
        note[notes_counter].content[tmp_count] = '\0';
        printf("\n");
        notes_counter++;
    }
}

getchar not working properly after using scanf in C

I am writing this code:
int b;
char c;
scanf("%d", &b);
while((c = getchar()) != EOF) {
if(c >= 9 || c < 0) {
printf("Invalid number!\n");
exit(0);
}
}
When I assign b, automatically c is equal to b.
For example, if my input for b is 10, it automatically goes into the if-statement and exits the code.
Does anyone know why?
Finding problems and solving them
Your code has several errors, listed below.
The getchar(3) man page says:
getchar() is equivalent to getc(stdin).
The prototype of getc() is:
int getc(FILE *stream);
That is, getchar() returns an int (the character read, converted from unsigned char). Thus, we need to change the type of c from char to int to hold its return value correctly.
Note that EOF is not a valid unsigned char value. It expands to a negative int constant, typically -1.
Never ignore the return value of scanf(3). It returns the number of input items successfully matched and assigned. In this case, to make the code reliable, we should write:
if (scanf("%d", &b) != 1) {
fprintf(stderr, "Value must be an integer.\n");
return EXIT_FAILURE;
}
There is a semantic error in the condition:
if (c >= 9 || c < 0)
Logically, 9 is a valid one-digit number, so removing the '=' from '>=' makes more sense.
More importantly, both the condition and the values it compares against should change: c holds a character code, so it has to be compared with '0' and '9', not with 0 and 9. See the next step.
The fixed loop should look like:
while ((c = getchar()) != EOF) {
    if (c == '\n') // Since c = getchar() can be '\n' too
        continue; // so, better ignore
    if (c >= '0' && c <= '9') // Compare against character codes
        printf("Valid number %d %c.\n", c, c);
    else
        printf("Invalid number.\n");
}
Sample test case output
1
10 // --- Note: 10 are two chars for two getchars
Valid number 49 1.
Valid number 48 0.
3
Valid number 51 3.
9
Valid number 57 9.
-
Invalid number.
a
Invalid number.
<
Invalid number.
.
Invalid number.
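The reason the original loop fires immediately is that "%d" stops at the first character that cannot be part of the integer and leaves it in the stream, so the next getchar() returns the newline typed after the number. The sketch below shows the same effect on a string, using sscanf as a stand-in for scanf; after_int is a hypothetical helper name, not something from the answer.

```c
#include <stdio.h>

/* Parses one int from s the way "%d" would and returns a pointer to the
 * first character "%d" did NOT consume, or NULL if no integer was found. */
const char *after_int(const char *s, int *value)
{
    int used = 0;

    /* %n stores how many characters were read so far; it is not counted
     * in sscanf's return value. */
    if (sscanf(s, "%d%n", value, &used) != 1)
        return NULL;
    return s + used;
}
```

For the input "10\n" the pointer returned refers to the '\n'. In ASCII that character has the value 10, so the question's test c >= 9 is true for it and the program exits.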

read ints from standard input until \n is found

I'm trying to make a function that reads ints from stdin. it has to read until a certain amount of numbers is read (count in example below), or until it finds a '\n'.
Since as far as I am aware scanf (with %d format specifier) ignores newlines, I used getchar and converted the character into the number it should be.
this works but only for 1 digit numbers.
is there any better way to achieve this?
This is my code:
char num = getchar();
while (num != '\n' && count < 9) {
//boring operations that don't matter
num = getchar();
}
Reading via fgets() is better. Continue reading if you must use scanf().
To use scanf("%d",...), we need extra care to read a line. As "%d" consumes leading white-space, including '\n', we need more code to look for white-space and test if a '\n' is found.
int count = 0;
while (count < 9) {
    // Read leading spaces
    int c;
    while (isspace((c = getchar())) && c != '\n') {
        ;
    }
    if (c == '\n' || c == EOF) break; // We are done reading
    ungetc(c, stdin); // put character back
    int some_int;
    if (scanf("%d", &some_int) == 1) {
        printf("Integer found %d\n", some_int);
        count++;
    } else {
        // Non-numeric input, consume at least 1 character.
        getchar();
    }
}
If numeric text is outside the range of int, the above use of "%d" is undefined behavior. For robust code, use fgets().
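For comparison, the fgets() route could look roughly like the sketch below. It is an illustration under assumptions rather than the answerer's code: ints_from_line is a hypothetical name, and the function expects a single line already read into a buffer (for example by fgets). It stops at the end of the line, at the first token that is not an integer, or at a value that does not fit in an int.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Parses up to max integers from one line of text and returns how many
 * were stored in out. */
size_t ints_from_line(const char *line, int *out, size_t max)
{
    size_t count = 0;

    while (count < max) {
        char *end;
        long v;

        errno = 0;
        v = strtol(line, &end, 10);     /* skips leading white-space itself */
        if (end == line)                /* nothing converted: end of line or garbage */
            break;
        if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
            break;                      /* out of range for int */
        out[count++] = (int)v;
        line = end;                     /* continue right after the number */
    }
    return count;
}
```

A typical call would be `char buf[256]; if (fgets(buf, sizeof buf, stdin)) n = ints_from_line(buf, nums, 9);`, where the buffer size of 256 is an arbitrary choice for the example.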
The %d conversion specifier only ignores leading whitespace. So you can do something like:
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
int n = argc > 1 ? strtol(argv[1], NULL, 10) : 10;
int x;
while( n-- && scanf("%d%*[ \t]", &x) == 1 ){
printf("Read: %d\n", x);
int c = getchar();
if( c == EOF || c == '\n' ){
break;
}
ungetc(c, stdin);
}
return 0;
}
However, this will probably not handle a stream like 10 5 x in a reasonable way. You'll need more logic on the first non-whitespace character after an integer to handle that (maybe just do if( c == EOF || ! isdigit(c) ){ break; }). Parsing data with scanf is fickle (it really never has a purpose outside of university exercises). Just use fgets and strtol.
scanf() doesn't ignore \n
#include <stdio.h>
#include <stddef.h>
int main(int argc , char *argv[])
{
int b;
char c;
scanf("%d%c",&b,&c);
if(c == '\n') printf("and then " );
}
Someone posted an answer and then deleted but it was the perfect solution for my problem, so all credit to the original author.
The solution was reading normally with scanf and afterwards,with getchar, checking if it was \n or EOF. If it was break out of the cycle, if it wasn't, "unread" with ungetc so you can scanf the number in the next iteration.
So my final code looks like this:
while(scanf("%d",&num) == 1 && count<9){
//boring operations
c = getchar();
if (c == EOF || c == '\n') break;
if (ungetc(c,stdin) == EOF) break;
}
NOTE: like Andrew Henle pointed out in the replies, this doesn't work unless it is guaranteed that there isn't any space between the digits and the newline

How do I return both char and int in one fprintf statement in C

I want to get a char and check whether it is an upper- or lower-case letter A-Z (returning 1 for A through 26 for Z), and if it is a digit 0-9, return the number itself. If it is neither, return the char itself. I'm having a hard time returning the char itself from just one printf statement.
#include <stdio.h>
int fun(char c0);
int main(){
char c;
while ( (c = getchar()) != EOF) {
if (c != '\n')
fprintf(stdout, "%d\n", fun(c));
}
return 0;
}
int fun(char c0){
if(c0 >= '0') { // check from the the lowest ascii, 0-9.
if(c0 <= '9') { // trigger means c0 is 0-9
return c0-'0';
}
else if(c0 <= 'z') { //check from the highest ascii, a-z
if(c0 >= 'a') { //trigger means c0 is a-z
return c0-'`';
}
else if(c0 <= 'Z') { // check lastly A-Z
if(c0 >= 'A')
return c0-'@';
}
}
}
return c0;
}
EDIT: changed some magical numbers and added a while loop.
Fundamentally, the challenge you're running into is that the return value from your function needs to be interpreted as either a character value, if it's a letter, or a number, if it's a character representing a number.
You can return a single int that will hold the result, with the intent that 0 - 9 means "I'm a number" and anything else means "the character with that code," but if so you'll need an if/else statement around your printf to determine how that should be printed. I suspect you'll just need to have two different printf statements.
Other options include returning a string to print out, though now you need to manage the memory for it, or having a second function that determines what printf specifier to use, which seems really hacky.
On an unrelated note, for readability purposes, don't use ASCII character codes in your program. For what you're doing, the lovely functions isalpha, isdigit, etc. from <ctype.h> not only do everything you're doing in a more compact way, but they do so more portably, since some (older) systems don't even use ASCII encoding.
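The "two different printf statements" idea can be sketched as follows, with snprintf into a caller-supplied buffer so that the choice between %d and %c lives in one place. describe_char is a hypothetical name, and the letter arithmetic assumes that letters are contiguous in the character set, as they are in ASCII.

```c
#include <ctype.h>
#include <stdio.h>

/* Writes the text for one input character into buf:
 * letters -> 1..26, digits -> their value, anything else -> the character. */
void describe_char(int c, char *buf, size_t size)
{
    if (isalpha(c))
        /* Assumes contiguous letters, as in ASCII. */
        snprintf(buf, size, "%d", toupper(c) - 'A' + 1);
    else if (isdigit(c))
        snprintf(buf, size, "%d", c - '0');
    else
        snprintf(buf, size, "%c", c);
}
```

The argument should be an int obtained from getchar() (after checking for EOF), since the <ctype.h> functions expect a value in the range of unsigned char.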

Putting numbers separated by a space into an array

I want to have a user enter numbers separated by a space and then store each value as an element of an array. Currently I have:
while ((c = getchar()) != '\n')
{
if (c != ' ')
arr[i++] = c - '0';
}
but, of course, this stores one digit per element.
If the user was to type:
10 567 92 3
I was wanting the value 10 to be stored in arr[0], and then 567 in arr[1] etc.
Should I be using scanf instead somehow?
There are several approaches, depending on how robust you want the code to be.
The most straightforward is to use scanf with the %d conversion specifier:
while (scanf("%d", &a[i++]) == 1)
/* empty loop */ ;
The %d conversion specifier tells scanf to skip over any leading whitespace and read up to the next non-digit character. The return value is the number of successful conversions and assignments. Since we're reading a single integer value, the return value should be 1 on success.
As written, this has a number of pitfalls. First, suppose your user enters more numbers than your array is sized to hold; if you're lucky you'll get an access violation immediately. If you're not, you'll wind up clobbering something important that will cause problems later (buffer overflows are a common malware exploit).
So you at least want to add code to make sure you don't go past the end of your array:
while (i < ARRAY_SIZE && scanf("%d", &a[i++]) == 1)
/* empty loop */;
Good so far. But now suppose your user fatfingers a non-numeric character in their input, like 12 3r5 67. As written, the loop will assign 12 to a[0], 3 to a[1], then it will see the r in the input stream, return 0 and exit without saving anything to a[2]. Here's where a subtle bug creeps in -- even though nothing gets assigned to a[2], the expression i++ still gets evaluated, so you'll think you assigned something to a[2] even though it contains a garbage value. So you might want to hold off on incrementing i until you know you had a successful read:
while (i < ARRAY_SIZE && scanf("%d", &a[i]) == 1)
i++;
Ideally, you'd like to reject 3r5 altogether. We can read the character immediately following the number and make sure it's whitespace; if it's not, we reject the input:
#include <ctype.h>
...
int tmp;
char follow;
int count;
...
while (i < ARRAY_SIZE && (count = scanf("%d%c", &tmp, &follow)) > 0)
{
    // accept the number if nothing follows it or if white-space follows it
    if (count == 1 || (count == 2 && isspace((unsigned char) follow)))
    {
        a[i++] = tmp;
    }
    else
    {
        printf ("Bad character detected: %c\n", follow);
        break;
    }
}
If we get two successful conversions, we make sure follow is a whitespace character - if it isn't, we print an error and exit the loop. If we get 1 successful conversion, that means there were no characters following the input number (meaning we hit EOF after the numeric input).
Alternately, we can read each input value as text and use strtol to do the conversion, which also allows you to catch the same kind of problem (my preferred method):
#include <ctype.h>
#include <stdlib.h>
...
char buf[INT_DIGITS + 3]; // account for sign character, newline, and 0 terminator
...
while(i < ARRAY_SIZE && fgets(buf, sizeof buf, stdin) != NULL)
{
char *follow; // note that follow is a pointer to char in this case
int val = (int) strtol(buf, &follow, 10);
if (isspace(*follow) || *follow == 0)
{
a[i++] = val;
}
else
{
printf("%s is not a valid integer string; exiting...\n", buf);
break;
}
}
BUT WAIT THERE'S MORE!
Suppose your user is one of those twisted QA types who likes to throw obnoxious input at your code "just to see what happens" and enters a number like 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890 which is obviously too large to fit into any of the standard integer types. Believe it or not, scanf("%d", &val) will not yak on this, and will wind up storing something to val, but again it's an input you'd probably like to reject outright.
If you only allow one value per line, this becomes relatively easy to guard against; fgets will store a newline character in the target buffer if there's room, so if we don't see a newline character in the input buffer then the user typed something that's longer than we're prepared to handle:
#include <string.h>
...
while (i < ARRAY_SIZE && fgets(buf, sizeof buf, stdin) != NULL)
{
char *newline = strchr(buf, '\n');
if (!newline)
{
printf("Input value too long\n");
/**
* Read until we see a newline or EOF to clear out the input stream
*/
while (!newline && fgets(buf, sizeof buf, stdin) != NULL)
newline = strchr(buf, '\n');
break;
}
...
}
If you want to allow multiple values per line such as '10 20 30', then this gets a bit harder. We could go back to reading individual characters from the input, and doing a sanity check on each (warning, untested):
...
while (i < ARRAY_SIZE)
{
    size_t j = 0;
    int c;
    while (j < sizeof buf - 1 && (c = getchar()) != EOF && isdigit(c))
        buf[j++] = c;
    buf[j] = 0;
    if (isdigit(c))
    {
        printf("Input too long to handle\n");
        while ((c = getchar()) != EOF && c != '\n') // clear out input stream
            /* empty loop */ ;
        break;
    }
    else if (!isspace(c))
    {
        if (isgraph(c))
            printf("Non-digit character %c seen in numeric input\n", c);
        else
            printf("Non-digit character %o seen in numeric input\n", c);
        while ((c = getchar()) != EOF && c != '\n') // clear out input stream
            /* empty loop */ ;
        break;
    }
    else
        a[i++] = (int) strtol(buf, NULL, 10); // no need for follow pointer,
                                              // since we've already checked
                                              // for non-digit characters.
}
Welcome to the wonderfully whacked-up world of interactive input in C.
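The over-long number case can also be caught through strtol's own error reporting instead of buffer-length checks. The following is a minimal sketch under assumptions, not the answerer's code: parse_int_strict is a hypothetical name, and it validates one token at a time, rejecting trailing garbage such as 3r5 as well as values outside the range of int.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 1 and stores the value if text holds exactly one in-range
 * integer, optionally surrounded by white-space; returns 0 otherwise. */
int parse_int_strict(const char *text, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text)
        return 0;                       /* no digits at all */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                       /* too large or too small for int */
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return 0;                       /* trailing garbage, e.g. "3r5" */
    *out = (int)v;
    return 1;
}
```

On overflow strtol returns LONG_MAX or LONG_MIN and sets errno to ERANGE, which is why errno is cleared before the call and tested afterwards.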
Small change to your code: only increment i when you read the space:
while ((c = getchar()) != '\n')
{
if (c != ' ')
arr[i] = arr[i] * 10 + c - '0';
else
i++;
}
Of course, it's better to use scanf:
while (scanf("%d", &a[i++]) == 1);
providing that you have enough space in the array. Also, be careful that the while above ends with ;, everything is done inside the loop condition.
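The character-by-character version above relies on arr being zero-initialized and never counts the last number on the line. A string-based sketch that handles both, plus runs of several spaces and a full array, might look like this. digits_to_ints is a hypothetical name; the sketch treats every non-digit as a separator, ignores signs, and does not check for overflow.

```c
#include <ctype.h>
#include <stddef.h>

/* Splits one line into non-negative integers by accumulating digits.
 * Returns the number of values stored in arr (at most cap). */
size_t digits_to_ints(const char *line, int *arr, size_t cap)
{
    size_t i = 0;
    int in_number = 0;              /* inside a run of digits? */

    for (; *line != '\0' && *line != '\n'; line++) {
        if (isdigit((unsigned char)*line)) {
            if (!in_number) {
                if (i == cap)
                    break;          /* array is full */
                arr[i] = 0;         /* start a new number from zero */
                in_number = 1;
            }
            arr[i] = arr[i] * 10 + (*line - '0');
        } else if (in_number) {
            i++;                    /* a non-digit ends the current number */
            in_number = 0;
        }
    }
    if (in_number)
        i++;                        /* count the last number on the line */
    return i;
}
```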
As a matter of fact, every return value should be checked.
scanf returns the number of items successfully scanned.
Give this code a try:
#include <stdio.h>
int main()
{
    int arr[500];
    int i = 0;
    int sc = 0; // scanned items
    int n = 3; // no of integers to be scanned from the single line in stdin
    // stop on a failed conversion or EOF instead of looping forever
    while (sc < n && scanf("%d", &arr[i]) == 1)
    {
        i++;
        sc++;
    }
    return 0;
}

Resources