How backspace can actually delete the string in getchar() loop - c

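#include <stdio.h>
#define ENTER_KEY '\n'
#define NULL_TERMINATOR '\0'
int main(void)
{
    char name[100];
    int input = 0; /* int so EOF can be detected; initialized before the first test */
    int counter = 0;
    /* Stop at Enter, at end of input, or when the array is nearly full. */
    while (input != ENTER_KEY && input != EOF && counter < 99)
    {
        input = getchar();
        name[counter] = (char)input;
        counter++;
    }
    counter--; /* step back over the last stored value (normally the newline) */
    name[counter] = NULL_TERMINATOR;
    printf("%s", name);
    return 0;
}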
If I type something, it should be stored character by character in the name array, and the counter should go up with every character I enter. But when I press Backspace, the counter appears to go down. For example, if I type "abcdef", press Backspace three times, change it to "abcxyz", and then press Enter, the program prints "abcxyz".

It depends on the console driver. On most systems (at least Unix-like systems in line mode and the Windows console), the program does not receive characters at the moment they are typed; instead, the system assembles a line (up to the newline character) and sends the full line to the program.
In that case, the backspace is often used to edit that console buffer, meaning that the erased characters are actually removed before the line is handed to the program. So if you type abcdef<backspace><backspace><backspace>xyz<Return>, the program will receive the following string: "abcxyz\n".
Beware: in a GUI program or in a fullscreen text-mode program like emacs or vi, the terminal is in raw mode (in Unix terms) and each character is received when it is typed. In that case, the program has to manage the input itself and erase from its own character array when it receives a <backspace>.

Related

Ending a Loop with EOF (without enter)

I am currently trying to end a while loop with something like this:
#include <stdio.h>
int main()
{
while(getchar() != EOF)
{
if( getchar() == EOF )
break;
}
return 0;
}
When I press CTRL+D on Ubuntu, the loop ends immediately. But on Windows I have to press CTRL+Z and then ENTER to end the loop. Can I get rid of the ENTER on Windows?
The getchar behavior
On Linux, end-of-file is signalled with ctrl + d, while on Windows it is signalled by the console when you press enter after ctrl + z has changed an internal state of the CRT library (this behaviour is kept for backward compatibility with very old systems). If I'm not mistaken, this is called a soft end of file. I don't think you can bypass it, since the end-of-file condition only reaches your getchar when you press enter, not when you press ctrl + z.
As reported here:
In Microsoft's DOS and Windows (and in CP/M and many DEC operating systems), reading from the terminal will never produce an EOF. Instead, programs recognize that the source is a terminal (or other "character device") and interpret a given reserved character or sequence as an end-of-file indicator; most commonly this is an ASCII Control-Z, code 26. Some MS-DOS programs, including parts of the Microsoft MS-DOS shell (COMMAND.COM) and operating-system utility programs (such as EDLIN), treat a Control-Z in a text file as marking the end of meaningful data, and/or append a Control-Z to the end when writing a text file. This was done for two reasons:
Backward compatibility with CP/M. The CP/M file system only recorded the lengths of files in multiples of 128-byte "records", so by convention a Control-Z character was used to mark the end of meaningful data if it ended in the middle of a record. The MS-DOS filesystem has always recorded the exact byte-length of files, so this was never necessary on MS-DOS.
It allows programs to use the same code to read input from both a terminal and a text file.
Further information is also reported here:
Some modern text file formats (e.g. CSV-1203[6]) still recommend a trailing EOF character to be appended as the last character in the file. However, typing Control+Z does not embed an EOF character into a file in either MS-DOS or Microsoft Windows, nor do the APIs of those systems use the character to denote the actual end of a file.
Some programming languages (e.g. Visual Basic) will not read past a "soft" EOF when using the built-in text file reading primitives (INPUT, LINE INPUT etc.), and alternate methods must be adopted, e.g. opening the file in binary mode or using the File System Object to progress beyond it.
Character 26 was used to mark "End of file" even if the ASCII calls it Substitute, and has other characters for this.
If you modify your code like this:
#include <stdio.h>
int main() {
while(1) {
int c = getchar(); // int, not char: EOF must stay distinguishable from every character
printf("%d\n", c);
if (c == EOF) // tried with also -1 and 26
break;
}
return 0;
}
and test it, you will see that on Windows the EOF (-1) is not printed to the console until you press enter. Before that, a ^Z is printed by the terminal emulator (I suspect). In my tests, this behavior is the same if:
you compile using the Microsoft Compiler
you compile using GCC
you run the compiled code in CMD window
you run the compiled code in bash emulator in windows
Update using Windows Console API
Following the suggestion of @eryksun, I have successfully written some (ridiculously complex for what it does) code for Windows that changes the behavior of conhost so that the program actually exits when ctrl + d is pressed. It does not handle everything; it is only an example. IMHO, this is something to avoid as much as possible, since its portability is less than zero. Also, handling other input cases correctly would take a lot more code, since this approach detaches stdin from the console and you have to handle it yourself.
The method works more or less as follows:
get the current handle for the standard input
create an array of input records, a structure that contains information about what happens in the conhost window (keyboard, mouse, resize, etc.)
read what happens in the window (the call reports the number of events read)
iterate over the event vector to handle the keyboard events, intercepting the required EOF (that is a 4, from what I've tested) to exit and printing any other ASCII character.
This is the code:
#include <windows.h>
#include <stdio.h>
#define Kev input_buffer[i].Event.KeyEvent // a shortcut
int main(void) {
HANDLE h_std_in; // Handler for the stdin
DWORD read_count, // number of events intercepted by ReadConsoleInput
i; // iterator
INPUT_RECORD input_buffer[128]; // Vector of events
h_std_in = GetStdHandle( // Get the stdin handler
STD_INPUT_HANDLE // enumerator for stdin. Others exist for stdout and stderr
);
while(1) {
ReadConsoleInput( // Read the input from the handler
h_std_in, // our handler
input_buffer, // the vector in which events will be saved
128, // the dimension of the vector
&read_count); // the number of events captured and saved (at most 128 in this case)
for (i = 0; i < read_count; i++) { // and here we iterate from 0 to read_count
switch(input_buffer[i].EventType) { // let's check the type of event
case KEY_EVENT: // to intercept the keyboard ones
if (Kev.bKeyDown) { // and refine only on key pressed (avoid a second event for key released)
// Intercepts CTRL + D
if (Kev.uChar.AsciiChar != 4)
printf("%c", Kev.uChar.AsciiChar);
else
return 0;
}
break;
default:
break;
}
}
}
return 0;
}
while(getchar() != EOF)
{
if( getchar() == EOF )
break;
}
return 0;
This is inconsistent.
If getchar() != EOF, the loop body is entered; otherwise (if getchar() == EOF) it is not. So there is no reason to check getchar() == EOF again inside the loop.
Moreover, you call getchar() twice per iteration, so each pass consumes two characters instead of one.
What were you trying to do?

EOF in C doesnt work

I have a problem with reading chars from input. The program should end when I press ENTER or CTRL + Z.
Example input:
LoreCTRL+Z .... keeps cycling
But when I press CTRL+Z with no text before it, it works.
Could anybody help me? Thanks
int intFromConsole = getchar();
if((intFromConsole == EOF) || (intFromConsole == '\n')){
//code
}
That is probably not related to your program but to the terminal subsystem (in the kernel), which is responsible for that behaviour.
Your terminal is usually in line-buffered mode, managed by the kernel's "line discipline". Notice that your program doesn't receive "Lore" as you type it, but only when you press return (the input is line-buffered).
Another catch: you usually shouldn't expect Ctrl+Z (ASCII 26) as input, since in most terminal states it is intercepted by the terminal subsystem, which suspends your program and returns you to the shell.
You can get more interesting information from Linus Akesson's article The TTY demystified.
The function getchar() uses buffered input. In Windows, Ctrl-Z is only recognised when it follows a newline entry. But in Windows, you might have conio.h available to you in the library. If so, here is something for you to experiment with. It accepts each keystroke immediately, and prints the value in hex. It exits when Ctrl-Z is pressed.
#include <stdio.h>
#include <conio.h>
#define ENDOF 0x1A // Ctrl-Z
int main(void) {
int ch;
do {
ch = _getch();
printf("%02X\n", ch);
} while(ch != ENDOF);
}
Program input
1234<Ctrl-Z>
Program output
31
32
33
34
1A
Program should end when I press ENTER or CTRL + Z.
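If the above is what you want, you could do something like this:
#include <stdio.h>
int main(void)
{
    int ch; /* int, not char, so that EOF can be told apart from any character */
    do
    {
        ch = getchar(); /* first character of the line */
        if (ch != EOF && ch != '\n')
        {
            /* Do the useful stuff here */
            int rest;
            while ((rest = getchar()) != EOF && rest != '\n')
                ; /* waste the rest of the buffer */
            if (rest == EOF)
                ch = EOF;
        }
    } while (ch != EOF && ch != '\n');
    /* Note the logical operator is && . It says if the key pressed
     * is neither Enter nor Ctrl -Z do some stuff.
     * The program exits only when you directly press Enter key or
     * Ctrl -Z / Ctrl -D. Okay ! Now it is time to take a nice walk
     * thru the Grand Central Park.
     */
    return 0;
}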
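On Linux machines, Ctrl-Z makes the terminal send SIGTSTP, which suspends the program by default, so it never reaches getchar() as input. (SIGTSTP can be caught; it is SIGSTOP that cannot be trapped.) So to signal EOF you should use Ctrl-D instead.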
References :
Linux Signals

getchar buffered input, EOF and terminal driver

I'm trying to understand how the terminal driver works in conjunction with getchar. Here are a few sample programs I wrote while reading K&R:
Code 1:
#include <stdio.h>
int main(){
int c = getchar();
putchar(c);
return 0;
}
Code 2:
#include <stdio.h>
int main(){
int c = EOF;
while((c=getchar()) != EOF){
printf("%c",c);
}
return 0;
}
Code 3:
//barebones program that emulates functionality of wc command
#include <stdio.h>
#define IN 1
#define OUT 0
int main(){
//nc = number of characters, nw = number of words, ns = number of spaces, nl = number of newlines
int c = EOF,nc=0,nw=0,ns=0,nl=0, state = OUT;
while((c=getchar())!=EOF){
++nc;
if(c=='\n'){
++nl;
state = OUT;
}
else if(c==' '){
++ns;
state = OUT;
}
else{
if(state == OUT){
state = IN;
++nw;}
}
}
printf("\n%d %d %d %d",nc,nw,ns,nl);
return 0;
}
I wish to understand when the terminal driver actually hands over the input string to the program. Assume my input is the string "this is a test" and I press enter; then here is how the above programs behave:
code 1: outputs "t" (and the program ends)
code 2: outputs "this is a test", jumps to the next line (because it also outputs the enter I pressed) and waits for input again.
code 3: does not output anything for the above string followed by an enter. I need to press Ctrl+D for the output to be displayed (output is 15 4 3 1)
1) Why, in the case of code 3, do I need to press Ctrl+D (EOF) explicitly for the input to be sent to my program? In other words, why was my input string sent to my program in the case of code 1 and code 2 after I pressed enter? Why didn't they also ask for EOF?
2) Also, in the case of code 3, if I do not press enter after the input string, I need to press Ctrl+D twice for the output to be displayed. Why is this the case?
EDIT:
For another input say "TESTING^D", here is how the above codes work:
1) outputs "T" and ends
2) outputs "TESTING" and waits for more input
3) outputs nothing until another Ctrl+D is pressed, then it outputs 7 1 0 0.
With this input, the terminal driver sends the input string to the program when Ctrl+D is received in the case of code 1 and code 2. Does that mean \n and Ctrl+D are treated the same way, i.e. they both serve as a marker for the terminal driver to send the input to the program? Then why do I need to press Ctrl+D twice in the second case?
This http://en.wikipedia.org/wiki/End-of-file says that the driver converts Ctrl+D into an EOF when it is on a new line. But with my "TESTING^D" input, it works fine even though the ^D is on the same line as the rest of the input. What could explain this?
General information:
In the case of code 2, you also need to press ctrl+D in order to exit.
In fact, EOF is produced by pressing ctrl+D, so what your while loop condition says is:
get input from the keyboard
store it in c
if the input was not equal to EOF, execute the body of the while loop
EOF is just a negative integer constant (usually -1) returned by getchar, and on a terminal you can trigger it by pressing ctrl+D. So taking this example:
while((c=getchar()) != EOF){
// execute code
}
printf("loop has exited because you press ctrl+D");
The loop keeps reading input and stops when you press ctrl+D; execution then continues with the rest of the code.
Answering your questions:
1) Why in case of code 3 do i need to press Ctrl+D (EOF) explicitly
for the input to be sent to my program? To put this in other words,
why was my input string sent to my program in case of code 1 and code
2 after i pressed enter? Why didn't it also ask for EOF?
In code 2 and code 3 (not only 3), you need to press Ctrl+D because the while loop stops reading from the keyboard only when it reads EOF. Code 1 does not loop: when you enter one or more characters, the program reads only the first one, prints it and terminates. There is no need for EOF in that case because no condition in the program asks for it.
2) Also, in case of code 3, if i do not press enter after the input
string, i need to press Ctrl+D twice for the output to be displayed.
Why is this the case?
If you type at least one character while the program is waiting for input and then press ctrl+D, the terminal stops collecting input and hands the characters entered so far to the program. If you then press ctrl+D again without typing anything in between, the read returns EOF, the while condition becomes false, and execution continues with the rest of the code.

EOF in Windows command prompt doesn't terminate input stream

Code:
#include <stdio.h>
#define NEWLINE '\n'
#define SPACE ' '
int main(void)
{
int ch;
int count = 0;
while((ch = getchar()) != EOF)
{
if(ch != NEWLINE && ch != SPACE)
count++;
}
printf("There are %d characters input\n" , count);
return 0;
}
Question:
Everything works just fine: the program ignores spaces and newlines and prints the number of characters entered (in this program I also treat commas, exclamation marks, digits and any other printable symbol, such as an ampersand, as characters) when I trigger the EOF simulation, which is ^z.
But something goes wrong when I give the program a line like abcdefg^z, that is, when I type some characters before ^z on the same line. Instead of terminating and printing the total number of characters, the program keeps asking for input.
The EOF character only terminates input when I type ^z on a line by itself or at the start of a line, like this: ^zabvcjdjsjsj. Why is this happening?
This is true in almost every terminal driver. You'll get the same behavior using Linux.
Your program isn't actually executing the loop until \n or ^z has been entered by you at the end of a line. The terminal driver is buffering the input and it hasn't been sent to your process until that occurs.
At the end of a line, hitting ^z (or ^d on Linux) does not cause the terminal driver to send EOF. It only makes it flush the buffer to your process (with no \n).
Hitting ^z (or ^d on Linux) at the start of a line is interpreted by the terminal as "I want to signal EOF".
You can observe this behavior if you add the following inside your loop:
printf("%d\n",ch);
Run your program:
$ ./test
abc <- type "abc" and hit "enter"
97
98
99
10
abc97 <- type "abc" and hit "^z"
98
99
To better understand this, you have to realize that EOF is not a character. ^z is a user command for the terminal itself. Because the terminal is responsible for taking user input and passing it to processes, this gets tricky and thus the confusion.
A way to see this is by hitting ^v then hitting ^z as input to your program.
^v is another terminal command that tells the terminal, "Hey, the next thing I type - don't interpret that as a terminal command; pass it to the process' input instead".
^Z is only translated by the console to an EOF signal to the program when it is typed at the start of a line. That's just the way that the Windows console works. There is no "workaround" to this behaviour that I know of.

Why does this getchar() loop stop after one character has been entered?

#include <stdio.h>
int main() {
char read = ' ';
while ((read = getchar()) != '\n') {
putchar(read);
}
return 0;
}
My input is f (followed by an enter, of course). I expect getchar() to ask for input again, but instead the program is terminated. How come? How can I fix this?
The terminal can sometimes be a little bit confusing. You should change your program to:
#include <stdio.h>
int main() {
int read;
while ((read = getchar()) != EOF) {
putchar(read);
}
return 0;
}
This will read until getchar returns EOF (most of the time this macro expands to -1). getchar returns an int, so you should make your variable 'read' an int so that you can check for EOF. You can send an EOF from your terminal with ^D on Linux and, I think, with ^Z on Windows.
To explain a little bit what happens. In your program the expression
(read = getchar()) !='\n'
will be true as long as no '\n' is read from the buffer. The problem is, to get the buffer to your program, you have to hit enter which corresponds to '\n'.
The following steps happen when your program is invoked in the terminal:
~$ ./a.out
this starts your program
(empty line)
getchar() made a system call to get an input from the terminal and the terminal takes over
f
you made an input in the terminal. The 'f' is written into the buffer and echoed back on the terminal, your program has no idea about the character yet.
f
f~$
You hit enter. Your buffer now contains 'f\n'. The 'enter' also signals to the terminal that it should return to your program. Your program reads the buffer, finds the f and puts it onto the screen, then finds the '\n', immediately stops the loop and ends.
This would be standard behaviour of most terminals. You can change this behaviour, but that would depend on your OS.
getchar() returns the next character from the input stream. This of course includes newlines etc. The fact that you don't see progress in your loop unless you press 'Enter' is caused by the fact that the I/O layer behind stdin doesn't hand over the input buffer to getchar() until it sees the '\n' at the end of the buffer. Your routine first blocks, then handles the two keystrokes in one go, terminating, as you specified, when '\n' appears in the input stream. Conclusion: getchar() will not remove the '\n' from the input stream (why should it?).
After f you press "enter", which is '\n', so the loop ends there.
If you want to read more characters, just type them one after the other; as soon as enter is pressed, the loop exits.
You've programmed it so the loop ends when you read a \n (enter), and you then return 0; from main which exits the program.
Perhaps you want something like
while ((read = getchar()) != EOF) {
putchar(read);
}
On *nix terminals you can press Control-D, which tells the tty driver to hand the input buffer to the app reading it. That's why ^D on a new line ends input: it causes the tty to return zero bytes, which the app interprets as end-of-file. But ^D also works anywhere on a line, where it just delivers what has been typed so far.
