Function of (char)getchar() in C programming


My friend asked me what (char)getchar() means; he found it in some code online. I searched and found no other uses of it, and I thought the regular usage was just ch = getchar(). This is the code he found. Can anyone explain what this expression does?
else if (input == 2)
{
if (notes_counter != LIST_SIZE)
{
printf("Enter header: ");
getchar();
char c = (char)getchar();
int tmp_count = 0;
while (c != '\n' && tmp_count < HEADER)
{
note[notes_counter].header[tmp_count++] = c;
c = (char)getchar();
}
note[notes_counter].header[tmp_count] = '\0';
printf("Enter content: ");
c = (char)getchar();
tmp_count = 0;
while (c != '\n' && tmp_count < CONTENT)
{
note[notes_counter].content[tmp_count++] = c;
c = (char)getchar();
}
note[notes_counter].content[tmp_count] = '\0';
printf("\n");
notes_counter++;
}
}

(char)getchar() is a mistake. Never use it.
getchar returns an int that is either an unsigned char value of a character that was read or is the value of EOF, which is negative. If you convert it to char, you lose the distinction between EOF and some character that maps to the same char value.
The result of getchar should always be assigned to an int object, not a char object, so that these values are preserved, and the result should be tested to see if it is EOF before the program assumes a character has been read. Since the program uses c to store the result of getchar, c should be declared as int c, not char c.
It is possible a compiler issued a warning for c = getchar(); because that assignment implicitly converts an int to a char, which can lose information as mentioned above. (This warning is not always issued by a compiler; it may depend on warning switches used.) The correct solution for that warning is to change c to an int, not to insert a cast to char.
About the conversion: The C standard allows char to be either signed or unsigned. If it is unsigned, then (char) getchar() will convert an EOF returned by getchar() to some non-negative value, which will be the same value as one of the character values. If it is signed, then (char) getchar() will convert some of the unsigned char character values to char in an implementation-defined way, and some of those conversions may produce the same value as EOF.
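To make the collision concrete, here is a minimal sketch. The helper names (char_is_signed, cast_collides_with_eof, eof_survives_cast) are hypothetical, and the result for the byte 0xFF assumes the common case of 8-bit bytes, EOF equal to -1, and the usual wrap-around conversion to signed char.

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 if plain char is a signed type on this implementation. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns 1 if the value ch (as returned by getchar()) can no longer be
   told apart from EOF once it has been converted to char. */
int cast_collides_with_eof(int ch)
{
    char c = (char)ch;      /* the conversion the question asks about */
    return c == EOF;        /* c is promoted back to int for the comparison */
}

/* Returns 1 if a real EOF is still recognisable after the cast. */
int eof_survives_cast(void)
{
    char c = (char)EOF;
    return c == EOF;
}
```

With a signed char, cast_collides_with_eof(0xFF) is expected to report a collision; with an unsigned char, eof_survives_cast() reports that EOF itself is lost. Either way, keeping the result in an int avoids the problem.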

The code is a typical example of incorrect usage of the getchar() function.
getchar(), and more generally getc(fp) and fgetc(fp), return a byte from the stream as a non-negative value between 0 and UCHAR_MAX, or the special negative value EOF upon error or end of file.
Storing this value into a variable of type char loses information. It makes testing for EOF:
- unreliable if type char is signed: if EOF has the value (-1), it cannot be distinguished from a valid byte value 255, which most likely gets converted to -1 when stored to a char variable on CPUs with 8-bit bytes;
- impossible on architectures where type char is unsigned by default, on which all char values are different from EOF.
In this program, the variables receiving the getchar() return value should have type int.
Note also that EOF is not tested in the code fragment, causing invalid input strings such as long sequences of ÿÿÿÿÿÿÿÿ at end of file.
Here is a modified version:
else if (input == 2)
{
if (notes_counter != LIST_SIZE)
{
int c;
// consume the rest of the input line left pending by `scanf()`
// this should be performed earlier in the function
while ((c = getchar()) != EOF && c != '\n')
continue;
printf("Enter header: ");
int tmp_count = 0;
while ((c = getchar()) != EOF && c != '\n') {
if (tmp_count + 1 < HEADER)
note[notes_counter].header[tmp_count++] = c;
}
note[notes_counter].header[tmp_count] = '\0';
printf("Enter content: ");
tmp_count = 0;
while ((c = getchar()) != EOF && c != '\n') {
if (tmp_count + 1 < CONTENT)
note[notes_counter].content[tmp_count++] = c;
}
note[notes_counter].content[tmp_count] = '\0';
printf("\n");
notes_counter++;
}
}
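The two input loops in the modified version follow the same pattern, which could be factored into a helper. This is only a sketch: the name read_line, its parameters, and its return convention are assumptions for illustration, not part of the original program.

```c
#include <stddef.h>
#include <stdio.h>

/* Reads one line from fp into buf, whose capacity is size bytes including
   the terminating '\0'. Characters that do not fit are read and discarded.
   Returns the number of characters stored, or -1 if end-of-file (or a read
   error) occurred before any character was read. */
long read_line(FILE *fp, char *buf, size_t size)
{
    int c;                  /* int, so EOF stays distinct from every byte */
    size_t n = 0;
    int got_any = 0;

    while ((c = getc(fp)) != EOF && c != '\n') {
        got_any = 1;
        if (n + 1 < size)
            buf[n++] = (char)c;
    }
    if (size > 0)
        buf[n] = '\0';
    if (c == EOF && !got_any)
        return -1;
    return (long)n;
}
```

A call such as read_line(stdin, note[notes_counter].header, HEADER) would then replace the header loop, and a return value of -1 tells the caller that the input has ended.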

Related

Are both these two cleaning buffer methods equivalent?

I was wondering:
Do they do exactly the same thing? Is calling c = getchar() inside the loop condition the same as doing it in a do...while loop?
void clrbuf(void)
{
int c;
while ((c = getchar()) != '\n' && c != EOF);
}
void clrbuf(void)
{
int c;
do c = getchar(); while (c != '\n' && c != EOF);
}
edit: c was once typed char, but folks told me int was the appropriate type for it

For starters the variable c should be declared like
int c;
because if the type char behaves as the type unsigned char then this condition
c != EOF
will always be true.
According to the C Standard (7.21 Input/output <stdio.h>)
EOF
which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate
end-of-file, that is, no more input from a stream;
So if the type char behaves as the type unsigned char (this depends on compiler options), then the value stored in the variable c will still be non-negative after the integer promotion to the type int, and so it can never compare equal to EOF.
The first while loop
while ((c = getchar()) != '\n' && c != EOF);
may be rewritten using the comma operator like
while ( c = getchar(), c != '\n' && c != EOF );
that is in fact it consists of two parts: the assignment expression c = getchar() and the condition c != '\n' && c != EOF.
As you can see it is equivalent to the do-while statement
do c = getchar(); while (c != '\n' && c != EOF);
However the first while loop
while ((c = getchar()) != '\n' && c != EOF);
is more expressive and clear.
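The equivalence can be checked directly. The sketch below reads from a FILE * instead of stdin so that it can be exercised on any stream; the names discard_line_while and discard_line_do are made up for illustration.

```c
#include <stdio.h>

/* while form: read and test in one controlling expression */
void discard_line_while(FILE *fp)
{
    int c;
    while ((c = getc(fp)) != '\n' && c != EOF)
        ;
}

/* do-while form: read in the body, test afterwards */
void discard_line_do(FILE *fp)
{
    int c;
    do
        c = getc(fp);
    while (c != '\n' && c != EOF);
}
```

Both functions read at least one character and stop right after the first newline or at end-of-file, so they leave the stream at the same position.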

Counting the number of lines in a .txt file in c

I have a processes.txt file that contains details about incoming processes like so,
0 4 96 30
3 2 32 40
5 1 100 20
20 3 4 30
I wanted to find the number of lines in this file. How can that be done?
I tried this code, but it always returns the number of lines as 0
char c;
int count = 0;
// fp is the file pointer
for (c = getc(fp); c != EOF; c = getc(fp))
if (c == '\n') // Increment count if this character is newline
count = count + 1;

Apart from the char that should be an int your code is more or less fine. The problem is somewhere in the code you didn't show.
This works:
#include <stdio.h>
int main() {
FILE* fp = fopen("processes.txt", "r");
if (fp == NULL)
{
printf("Could not open file.");
return 1;
}
int c; // this must be an int
int count = 0;
for (c = getc(fp); c != EOF; c = getc(fp))
if (c == '\n') // Increment count if this character is newline
count = count + 1;
printf("The file has %d line(s)\n", count);
fclose(fp);
}
However if the last line of the file does not end with a \n, it is not counted.
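If a final line without a trailing newline should count as well, one option is to remember the last character read. This is a sketch under that assumption; the function name count_lines is hypothetical.

```c
#include <stdio.h>

/* Counts the lines in fp, including a last line that is not terminated
   by '\n'. An empty stream has 0 lines. */
long count_lines(FILE *fp)
{
    int c;              /* int, so that EOF can be detected reliably */
    int last = '\n';    /* treat "nothing read yet" like a finished line */
    long count = 0;

    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            count++;
        last = c;
    }
    if (last != '\n')   /* unterminated final line */
        count++;
    return count;
}
```

For the four-line processes.txt from the question, this gives 4 whether or not the file ends with a newline.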

Please, read How to create a minimal, complete and verifiable example.
In order to test your snippet, I first had to complete it so that it compiles. Your error has probably gone away with those modifications, since running it on your input text gives the output shown below:
pru.c
#include <stdio.h>
int main()
{
char c;
int count = 0;
FILE *fp = stdin; // most probably your error is
// related to this initialization.
// fp is the file pointer
for (c = getc(fp); c != EOF; c = getc(fp))
if (c == '\n') // Increment count if this character is newline
count = count + 1;
printf("%d\n", count);
return 0;
}
and running it:
$ pru <<EOF
0 4 96 30
3 2 32 40
5 1 100 20
20 3 4 30
EOF
4
$ _
Which is the correct answer.
Despite this, your program fragment contains a hidden error, as you have been told in the comments to your question: the type of the variable c should be int and not char. But why?
Because every value of type char is a possible input byte, one extra value is needed to signal the special condition that the end of the stream has been reached (EOF is not one of the byte values, but a separate condition). Type char is therefore too small to hold all possible return values of fgetc(3), which is why fgetc(3) returns an int.
Check the documentation of fgetc(3). Your program almost works, and here is the reason why:
When the program reads a character, it is returned as an int value from 0 to 255, so every byte maps to a non-negative integer, while EOF is normally (in almost every implementation) the integer value -1. What happens here is that every return value is converted to char, so EOF is mapped onto one of the 256 char values (which one depends on the implementation, but normally it is 255, or -1 if char happens to be signed), so:
if your char type is signed (two's complement), the values 0 to 255 are mapped into 0 to 127 and -128 to -1, and the EOF value is mapped to one of them (usually -1).
if your char type is unsigned, the values 0 to 255 are mapped into 0 to 255 and the EOF value is mapped to one of them (most probably 255).
In the comparison c != EOF, the char value is promoted back to int. If char is signed, the byte 255 and EOF both end up as -1, so one such character on input is interpreted as EOF and makes your program stop prematurely. If char is unsigned, the promoted value is never negative, so it never equals EOF and the loop never ends.
In the signed case, if a byte that maps to the same value as EOF is input, your program will finish, believing that it has reached the end of the file, and your count will be wrong. This is not the case here, but you can get a surprise with a file that contains such a character.
So your final program (corrected) would be:
#include <stdio.h>
int main()
{
int c;
int count = 0;
FILE *fp = stdin;
// fp is the file pointer
for (c = getc(fp); c != EOF; c = getc(fp))
if (c == '\n') // Increment count if this character is newline
count = count + 1;
printf("%d\n", count);
return 0;
}
Before finishing, I recommend using a while loop, a frequently used idiom in C that gives a more compact form of your loop:
#include <stdio.h>
int main()
{
int c;
long count = 0;
FILE *fp = stdin; /* you probably have not initialized this
* variable in your code, but who knows,
* since you have not posted a complete
* sample */
// fp is the file pointer
while ((c = getc(fp)) != EOF)
if (c == '\n') // Increment count if this character is newline
count++; // this is another frequently used idiom :)
printf("%ld\n", count); // count is a long, so it needs %ld
return 0;
}

Maybe your file is already at end-of-file or in an error state when you do this? In that case you need to start again at the beginning:
int c; // c must be int
int count = 0;
// fp is the file pointer
rewind(fp); // back to beginning, clear error
for (c = getc(fp); c != EOF; c = getc(fp))
if (c == '\n') // Increment count if this character is newline
count = count + 1;

Understanding K&R's getint() (Chapter 5: Pointers & Arrays, Exercise 1)?

I'm a novice programmer who's self-studying C through K&R. I don't understand the design of their function getint(), which converts a string of digits into the integer it represents. I'll ask my question then post the code below.
If getch() returns a non-digit character that's not a '-' or '+', it pushes this non-digit character back onto the input with ungetch(), and returns 0. So, if getint() is called again, getch() will just return that same non-digit character that was pushed back, so ungetch() will push it back again, etc. The way I understand it (which could be wrong), the function breaks completely if it's passed any non-digit character.
The exercise doesn't have you fix this. It asks to fix the fact that a '-' or '+' followed by a non-digit is a valid representation of 0.
What exactly am I missing here? Did they design getint() to make an infinite loop if the input is anything other than 0-9? Why?
Here's their code for getint() [edit] with main calling getint():
int getint(int *);
int main()
{
int n, array[BUFSIZE];
for (n = 0; n < BUFSIZE && getint(&array[n]) != EOF; n++)
;
return 0;
}
int getch(void);
void ungetch(int);
int getint(int *pn)
{
int c, sign;
while (isspace(c = getch())) // skip white space
;
if (!isdigit(c) && c != EOF && c != '+' && c != '-') {
ungetch(c); //this is what i don't understand
return 0;
}
sign = (c == '-') ? -1 : 1;
if (c == '-' || c == '+')
c = getch();
for (*pn = 0; isdigit(c); c = getch())
*pn = 10 * *pn + (c - '0');
*pn *= sign;
if (c != EOF)
ungetch(c);
return c;
}
int buf[BUFSIZE];
int bufp = 0;
int getch(void)
{
return (bufp > 0) ? buf[--bufp] : getchar();
}
void ungetch(int c)
{
if (bufp >= BUFSIZE)
printf("ungetch: can't push character\n");
else
buf[bufp++] = c;
}

As currently written, getint() tries to read an integer from user input and stores it in *pn.
If the user inputs a positive or negative number (with or without a sign), *pn is set to that number and getint() returns a positive value (the character that follows the number).
If the user inputs something that is not a valid number, *pn is not updated and getint() returns 0 (meaning it failed).
"the function breaks completely if it's passed any non-digit character."
That's right. All subsequent calls to getint() will fail, because the offending character was passed to ungetch(). What you understand is correct.
But this is how getint() is supposed to handle garbage input. It simply rejects it and returns 0 (meaning it failed). It is not the responsibility of getint() to take care of non-integer input and prepare fresh input for the next read. It is not a bug.
The only bug is that a '-' or '+' followed by a non-digit is currently considered a valid representation of 0. Fixing that is left to the reader as an exercise.
If the input is at EOF, *pn is set to 0 (by the initialisation of the for loop, then multiplied by 1) and getint() returns EOF.

The reasoning is similar to scanf not consuming characters that don't match the conversion specifier - you don't want to consume something that isn't part of a valid integer, but may be part of a valid string or other type of input. getint has no way of knowing whether the input it rejects is part of an otherwise valid non-numeric input, so it has to leave the input stream the same way it found it.
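Since the rejected character stays in the stream, it is the caller that decides what to do with it. The sketch below keeps the logic of the quoted getint() but, as an assumption made to keep it self-contained, reads from a FILE * with the standard getc() and ungetc() instead of getch() and ungetch(). The names getint_from and sum_ints are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>

/* Same logic as the getint() quoted in the question, reading from fp. */
int getint_from(FILE *fp, int *pn)
{
    int c, sign;

    while (isspace(c = getc(fp)))   /* skip white space */
        ;
    if (!isdigit(c) && c != EOF && c != '+' && c != '-') {
        ungetc(c, fp);              /* leave the rejected character unread */
        return 0;
    }
    sign = (c == '-') ? -1 : 1;
    if (c == '-' || c == '+')
        c = getc(fp);
    for (*pn = 0; isdigit(c); c = getc(fp))
        *pn = 10 * *pn + (c - '0');
    *pn *= sign;
    if (c != EOF)
        ungetc(c, fp);
    return c;
}

/* Hypothetical caller: sums the integers in the stream and consumes one
   character each time getint_from() reports "not a number". */
long sum_ints(FILE *fp)
{
    long total = 0;
    int value, status;

    while ((status = getint_from(fp, &value)) != EOF) {
        if (status == 0)
            (void)getc(fp);         /* discard the character that was pushed back */
        else
            total += value;
    }
    return total;
}
```

Note that, as in the original, a number terminated directly by end-of-file is stored in *pn but reported as EOF, so this caller only sums numbers that are followed by at least one more character such as a newline.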

Testing chars and ints within a char variable

I'm having trouble finding out how to set up a loop where I enter input and then stop the input by pressing 'e' or 'E'. The input entered is integers, but it needs to be stopped with a character. That is where I get lost. I have seen a lot of information about using ASCII conversions, but I don't know how efficient that would be. This code is broken, but it is as far as I could get. Any information would be helpful.
int main(void)
{
char num;
int sub;
while (sub != 'e' || sub != 'E') {
scanf("%d", &num);
sub = &num;
printf("%d", num);
}
return 0;
}

Simple.
#include <stdio.h>
#include <ctype.h>
int main(void) {
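int c = getchar(); // int, so that EOF can be detected
int num;
while (c != EOF && c != 'e' && c != 'E') { // with || the test would always be true
if (isdigit(c))
num = c - '0'; // value of a single digit
c = getchar();
}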
return 0;
}
But you don't have to use an ASCII character as a way to stop input. You can use EOF, a negative value (typically -1) that getchar() returns at end of input. It is signalled with Ctrl-D on UNIX systems and Ctrl-Z on Windows.
int c;
while ((c = getchar()) != EOF)

A direct way to distinguish between an input of an int, 'e' and 'E' is to read a line of user input with fgets() and then parse it.
#define LINE_SZ 80
char buf[LINE_SZ];
int num;
while (fgets(buf, sizeof buf, stdin) && buf[0] != 'e' && buf[0] != 'E') {
if (sscanf(buf, "%d", &num) != 1) {
Handle_other_non_int_input();
continue; // num was not set, so skip the print
}
printf("%d", num);
}
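The parsing step can be isolated in a small function that classifies one line, which makes it easy to try out without typing input. This is a sketch; the enum and the name classify_line are assumptions for illustration.

```c
#include <stdio.h>

enum entry_kind { ENTRY_INT, ENTRY_QUIT, ENTRY_OTHER };

/* Classifies one input line: 'e' or 'E' as the first character means
   quit, a leading integer is stored in *value, anything else is
   reported as other input. */
enum entry_kind classify_line(const char *line, int *value)
{
    if (line[0] == 'e' || line[0] == 'E')
        return ENTRY_QUIT;
    if (sscanf(line, "%d", value) == 1)
        return ENTRY_INT;
    return ENTRY_OTHER;
}
```

A loop around fgets() would call classify_line(buf, &num) on each line and stop on ENTRY_QUIT.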

As noted in the comments, (sub != 'e' || sub != 'E') is always true, since sub can never be e and E at the same time.
Note that sub is an int and not an integer pointer (int *).
The line sub = &num; assigns num's address to sub.
Also, the value of sub is used in the controlling expression of the while loop before it is initialised, so its value at that point is indeterminate. You have to initialise it with some value before using it.
Instead, do:
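int num, rv;
while( 1 )
{
rv=scanf("%d", &num);
if(rv==EOF)
{
break; // end of input or read error
}
if(rv==0)
{
if( (num=getchar())=='e' || num=='E' )
{
break;
}
else
{
// discard the rest of the line, stopping at EOF too
while( num!='\n' && num!=EOF )
num=getchar();
continue;
}
}
printf("\n%d", num);
}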
A value is read into num by scanf() whose return value is stored in rv.
scanf() returns the number of successful assignments which in this case should be 1 if an integer value was read into num since %d is the format specifier.
If rv is 1, the input was a number and it is printed. Otherwise it could be a character, which won't be read by scanf() and remains unconsumed in the input buffer. The first byte of this data is read by getchar(). If it is e or E, the loop is exited; otherwise the input buffer is cleared up to the next \n and the next iteration of the loop starts without reaching the part where the printing takes place.

difference between scanf("%c" , ..) and getchar()

I always thought scanf("%c" , &addr); was equivalent to getchar(), until I tested this:
#include<stdio.h>
int main()
{
int i;
scanf("%c",&i);
printf("%d\n", i);
if(i == EOF)
printf("EOF int type and char input\n");
i =getchar();
printf("%d\n", i);
if(i == EOF)
printf("EOF int type and char input\n");
}
I got this output when I pressed "Ctrl+D" twice:
-1217114112
-1
EOF int type and char input
Since EOF is -1 as an int, I also tried replacing scanf("%c",&i) with scanf("%d",&i), and got the same output.
I got confused. Can anybody explain this to me?
----------------------------------EDIT-----------------------------------------------
I wanted to know how scanf("%c",&i) behaves on Ctrl+D, so I ran this test:
#include<stdio.h>
int main()
{
int i;
int j;
j = scanf("%c",&i);
printf("%c\n", i);
printf("%d\n", j);
if(i == EOF)
printf("EOF int type and char input");
i =getchar();
printf("%d\n", i);
if(i == EOF)
printf("EOF int type and char input");
}
Output:
k // If the scanf set 1 byte in i , why here print 'k' ?
-1
-1
EOF int type and char input

Your scanf() call does not fully set i, and using i afterwards involves Undefined Behavior (UB).
int i; // the value of i could be anything
scanf("%c",&i); // At most, only 1 byte of i is set, the remaining bytes are still unknown.
printf("%d\n", i);// You are printing 'i' whose value is not fully determined.
Had you tried
char ch;
int y = scanf("%c",&ch);
printf("%d\n", ch);
if(ch == EOF)
you would potentially get a match even though the input was not EOF. Had you scanned in a char with the value 255, the char would take on the two's complement 8-bit value of -1. The comparison would sign-extend the 8-bit -1 to the size of int, and it would match -1.
(Assumptions: two's complement integers, 8-bit byte, EOF == -1, char is signed).
The correct EOF test is
int y = scanf("%c",&ch);
if (y == EOF)
Note: when getchar() or scanf() returns EOF, it means either end-of-file or an I/O error. A subsequent check of ferror(stdin) distinguishes the two.
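Putting the return-value test and the ferror() check together might look like the sketch below. It uses fscanf() on a FILE * so that it works on any stream; the name read_char and its return convention are assumptions for illustration.

```c
#include <stdio.h>

/* Reads one character with the %c conversion.
   Returns 1 and stores the character in *out on success,
   0 at end-of-file, and -1 on a read error. */
int read_char(FILE *fp, char *out)
{
    int rc = fscanf(fp, "%c", out);   /* %c needs a char *, not an int * */

    if (rc == 1)
        return 1;
    return ferror(fp) ? -1 : 0;       /* rc is EOF here */
}
```

Because the end-of-input decision is based on the return value and not on the stored character, a byte with the value 255 is read like any other character.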

The first value printed is probably the result of undefined behavior. You can't rely on i having a value unless scanf() returns 1.
With scanf() in particular, you seem to be confusing the scanned value (the conversion of characters according to a format specifier in the first argument) with the return value of the function call.
With getchar(), of course, this distinction doesn't exist since it only has a return value.
