I am working on the terminal input functionality in an application that can receive input via terminal. I have canonical mode and echo disabled so all terminal input is truly processed by the program (with some exceptions, like ^C, etc.). One reason for doing this is input cannot block and by default the kernel buffers input until it gets a new line, and I really need character by character processing.
For the most part, this isn't an issue. I can handle things like backspace correctly. I can even handle escape-sequence characters like the left arrow key by doing this multiple times.
A challenge I have encountered is how to insert characters on the terminal, sort of like using the "INSERT" key (on by default), in a graphical application. I can echo the left arrow sequence back to the terminal to get the cursor to shift left one character on the terminal.
However, if I type a character, it then overwrites the character present there (if one). This isn't unexpected, but I would like to optionally be able to insert the character inbetween the two, the same way that any regular "cooked" TTY would do it. I tried sending the "INSERT" key escape sequence to the terminal, followed by echoing the typed key, and this causes all kinds of terminal chaos to break loose. If I type t, it minimizes the entire terminal window. Typing r moves the cursor to the top of the screen, and e does a page down of the cursor.
"Something" is happening here, but this isn't quite what I was looking for.
What escape sequence would I need to send to the TTY in order to have it insert the character, as opposed to replace?
For reference, this is what I'm sending to the TTY for "Insert":
27, 91, 50.
These are the raw values written out to the terminal output file descriptor using write.
If I then send an 'e' to the TTY, that causes the window to minimize (SSH session on Windows).
Some escape sequences, like left arrow, are bidirectional in that sending the left arrow to the server also works if sent to the client, in accomplishing what you'd think it would, but it seems the same may not be true for all of them. Is there any way I can force the next character to be inserted and shift characters down, as opposed to overwriting?
An alternative would be to print \r and then rewrite the entire line as it existed so far, but I'm trying to avoid that. Essentially, I'm trying to emulate however a higher-level line editor would do this, but I can't find any details on the escaping for this. This page has a good example, but it seems to rewrite the entire line.
Related
I'm trying to write a mini shell in C for a school project, and what I want to do is doing a sort of command history (like in shell), when I press the UP key it writes the previous input into the input part, DOWN does the opposite, etc..., and you can edit it before pressing enter to send it to the program, like this (sorry for the bad english): [] represents the user cursor
my_shell$ some input wrote by me
my_shell$ []
my_shell$ some other input
my_shell$ []
and now if I press UP
my_shell$ some other input[]
If I press UP again
my_shell$ some input wrote by me[]
I'm allowed to use termcaps and some other functions isatty, ttyname, ttyslot, ioctl, getenv, tcsetattr, tcgetattr, tgetent, tgetflag, tgetnum, tgetstr, tgoto, tputs.
The problem is I can't understand the documentation of ioctl and tty functions, and I can't find well explained tutorials on these functions with examples, and I can't find the documentation for what I'm trying to do with them.
Can someone explain me those functions in an understandable way ? And how should I apply them for what I'm trying to do (I'm searching for a Linux-MacOs compatibility way)
Thanks for your help.
What you are asking is non trivial and will require a fair amount of work. Basically, what you need to do is use tcsetattr to put the terminal into non-canonical mode where the terminal's input line buffering is disabled and every keystroke will be returned immediately rather than waiting for a newline. You will then have to process every keystroke yourself, including backspace/delete and up/down arrows. Because of what you want to do with line editing, you'll probably also have to disable echoing and do the echo yourself.
You'll need to maintain the current line buffer yourself, and you'll also need a data structure that stores all the older lines of input (the history) and when an up-arrow is hit, you'll need to erase what you've currently buffered for the input, and erase it from the screen, then copy the history into your current buffer and echo it to the terminal.
Another complication is that keys like up-arraow and down-arrow are not part of ascii, so won't be read as a single byte -- instead they'll be multibyte escape sequences (likely beginning with an ESC ('\x1b') character). You can use tgetstr to quesry the terminal database to figure out what they are, or just hardcode your shell to use the ANSI sequences which are what almost all terminals you will see these days use.
I would like to read characters from stdin until one of the following occurs:
an end-of-line marker is encountered (the normal case, in my thinking),
the EOF condition occurs, or
an error occurs.
How can I guarantee that one of the above events will happen eventually? In other words, how do I guarantee that getchar will eventually return either \n or EOF, provided that no error (in terms of ferror(stdin)) occurs?
// (How) can we guarantee that the LABEL'ed statement will be reached?
int done = 0;
while (!0) if (
(c = getchar()) == EOF || ferror(stdin) || c == '\n') break;
LABEL: done = !0;
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur. It seems like the answer will have to do with the properties of the device. Where can those details be found (in the doumentation for compiler, device firmware, or device hardware perhaps)?
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition. Similarly for files stored on disc / SSD.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I am using C89, but I am curious if the answer depends on which C standard is used.
You can't.
Let's say I run your program, then I put a weight on my keyboard's "X" key and go on vacation to Hawaii. On the way there, I get struck by lightning and die.
There will never be any input other than 'x'.
Or, I may decide to type the complete story of Moby Dick, without pressing enter. It will probably take a few days. How long should your program wait before it decides that maybe I won't ever finish typing?
What do you want it to do?
Looking at all the discussion in the comments, it seems you are looking in the wrong place:
It is not a matter of keyboard drivers or wrapping stdin.
It is also not a matter of what programming language you are using.
It is a matter of the purpose of the input in your software.
Basically, it is up to you as a programmer to know how much input you want or need, and then decide when to stop reading input, even if valid input is still available.
Note, that not only are there devices that can send input forever without triggering EOF or end of line condition, but there are also programs that will happily read input forever.
This is by design.
Common examples can be found in POSIX style OS (like Linux) command line tools.
Here is a simple example:
cat /dev/urandom | hexdump
This will print random numbers for as long as your computer is running, or until you hit Ctrl+C
Though cat will stop working when there is nothing more to print (EOF or any read error), it does not expect such an end, so unless there is a bug in the implementation you are using it should happily run forever.
So the real question is:
When does your program need to stop reading characters and why?
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur.
A device such as /dev/zero, for example. Yes, stdin can be connected to a device that never provides a newline or reaches EOF, and that is not expected ever to report an error condition.
It seems like the answer will have to do with the properties of the device.
Indeed so.
Where can those details be found (in the doumentation for compiler, device firmware, or device hardware perhaps)?
Generally, it's a question of the device driver. And in some cases (such as the /dev/zero example) that's all there is anyway. Generally drivers do things that are sensible for the underlying hardware, but in principle, they don't have to do.
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition.
No. Generally speaking, an end-of-line marker is sent by a terminal device if and only if the <enter> key is pressed. An end-of-file condition might be signaled if the terminal disconnects (but the program continues), or if the user explicitly causes one to be sent (by typing <-<D> on Linux or Mac, for example, or <-<Z> on Windows). Neither of those events need actually happen on any given run of a program, and it is very common for the latter not to do.
Similarly for files stored on disc / SSD.
You can generally rely on data read from an ordinary file to contain newlines where they are present in the file itself. If the file is open in text mode, then the system-specific text line terminator will also be translated to a newline, if it differs. It is not necessary for a file to contain any of those, so a program reading from a regular file might never see a newline.
You can rely on EOF being signaled when a read is attempted while the file position is at or past the and of the file's data.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I think you're trying too hard.
Reading to end-of-line might be a reasonable thing to do in some cases. Expecting a newline to eventually be reached is reasonable if the program is intended to support interactive use. But trying to ensure that invalid data cannot be fed to your program is a losing cause. Your objective should be to accept the widest array of inputs you reasonably can, and to fail gracefully when other inputs are presented.
If you need to read input in a line-by-line mode then by all means do that, and document that you do it. If only the first n characters of each line are significant to the program then document that, too. Then, if your program never terminates when a user connects its input to /dev/zero that's on them, not on you.
On the other hand, try to avoid placing arbitrary constraints, especially on sizes of things. If there is not a natural limit on the size of something, then no artificial limit you introduce will ever be enough.
I'm developing an app in C using ncurses, I want a box where I can get user input, only problem is, I can't find a way to do it that works the way I need it to.
For example, the closest ive come is the mvwgetnstr() routine, however this doesn't print the inputted characters back onto the window like I want it to. Ive searched around for quite a while now but I couldn't find anything.
Thanks for the help!
edit: just to clarify, I would need a routine like mvwgetnstr() just with the input being printed back onto the window.
The getstr manual page tells you:
Characters input are echoed only if echo is currently on. In that
case, backspace is echoed as deletion of the previous character (typically a left motion).
and the echo manual page gives you more information:
The echo and noecho routines control whether characters typed by the
user are echoed by getch(3x) as they are typed. Echoing by the tty
driver is always disabled, but initially getch is in echo mode, so
characters typed are echoed. Authors of most interactive programs prefer to do their own echoing in a controlled area of the screen, or not
to echo at all, so they disable echoing by calling noecho. [See
curs_getch(3x) for a discussion of how these routines interact with
cbreak and nocbreak.]
The ncurses manual page advised initializing it with noecho; your program can turn that off (or on), at any time.
I have an RFID tag reader. But it works like a HID device (like a keyboard). It sends keystrokes to the computer when a tag is scanned. When I open notepad and scan a tag - it types the ID one digit at a time. Is there a way to create a program to listen to this device (or this port) and capture (intercept) all input. So that the keystrokes wouldn't appear on my system but I could assign my own events when the device sends and input. I don't want it to show up on Notepad.
I realize that the implementation can differ depending on the OS and programming language used. Ideally, I would like to make this work on both Windows and Linux. I would prefer to use something like Node.js but I suppose C could also be good.
I would appreciate any hints or pointing me in the right direction.
You could open the raw input device for reading (basically ioctl with parameter EVIOCGRAB for Linux and RegisterRawInputDevices() for Windows as discussed here and here). However, the mechanisms are quite different for Windows and Linux, so you will end up implementing all the low-level logic twice.
It should also be possible to read the input data stream from the standard input just like you would read an input from the keyboard (e.g. scanf() or fgets() in C) with some logic that recognizes when a data set (= tag ID) is complete - the reader device might for example terminate an input with a newline '\n' or null character '\0'.
You should probably do this in a separate thread and have some kind of producer-consumer mechanism or event model for communication with your main application.
I am studying for an exam and I am confused as to how canonical vs. non-canonical input/output works in Unix (e.g., curses). I understand that there is a buffer to which "line disciplines" are applied for canonical input. Does this mean that the buffer is bypassed for non-canonical input, or does it simply mean that no line disciplines are applied? How does this process differ for input and output operations?
In the curses programs I have worked with that demonstrate canonical input, the input typed by a user is automatically entered either after a certain number of characters have been typed or a certain amount of time has passed. Are either of these things considered "line disciplines" or is this something else entirely?
For canonical input — think shell; actually, think good old-fashioned Bourne shell, since Bash and relatives have command-line editing. You type a line of input; if you make a mistake, you use the erase character (default is Backspace, usually; sometimes Delete) to erase the previous character. If you mess up completely, you can cancel the whole line with the line kill character (not completely standardized, often Control-X). On some systems, you get a word erase with Control-W. All this is canonical input. The entire line is gathered and edited up until the end of line character — Return — is pressed. Thereupon, the whole line is made available to waiting programs. Depending on the read() system calls that are outstanding, the whole line will be available to be read (by one or more calls to read()).
For non-canonical input — think vi or vim or whatever — you press a character, and it is immediately available to the program. You aren't held up until you hit return. The system does no editing of the characters; they are made available to the program as soon as they are typed. It is up to the program to interpret things appropriately. Now, vim does do a number of things that look a bit like canonical input. For example, backspace moves backwards, and in input mode erases what was there. But that's because vim chooses to make it behave like that.
Canonical and non-canonical output is a much less serious business. There are a few bits and pieces of difference, related to things like whether to echo carriage-return before line-feed, and whether to do delays (not necessary with electronics; important in the days when the output device might have been a 110-baud teletype). It can also do things like handle case-insensitive output devices — teletypes, again. Lower-case letters are output in caps, and upper-case letters as backslash and caps.
It used to be that if you typed all upper-case letters to the login prompt, then the login program would automatically convert to the mode where all caps were output with a backslash in front of each actual capital. I suspect that this is no longer done on electronic terminals.
In a comment, TitaniumDecoy asked:
So with non-canonical input, is the input buffer bypassed completely? Also, where do line disciplines come in?
With non-canonical input, the input buffer is still used; if there is no program with a read() call waiting for input from the terminal, the characters are held in the input buffer. What doesn't happen is any editing of the input buffer.
Line disciplines are things like the set of manipulations that the input editing does. So, one aspect of the line discipline is that the erase character erases a prior character in canonical input mode. If you have icase (input case-mapping) set, then upper-case characters are mapped to lower-case unless preceded by a backslash; that is a line discipline, I believe, or an aspect of a line discipline.
I forgot to mention that EOF processing (Control-D) is handled in canonical mode; it actually means 'make the accumulated input available to read()'; if there is no accumulated input (if you type Control-D at the beginning of a line), then the read() will return zero bytes, which is then interpreted as EOF by programs. Of course, you can merrily type more characters on the keyboard after that, and programs that ignore EOF (or run in non-canonical mode) will be quite happy.
Of course, in canonical mode, the characters typed at the keyboard are normally echoed to the screen; you can control whether that echoing occurs. However, this is somewhat tangential to canonical input; the normal editing occurs even when echo is off.
Similarly, the interrupt and quit signals are artefacts of canonical mode processing. So too are the job control signals such as Control-Z to suspend the current process and return to the shell. Likewise, flow control (Control-S, Control-Q to stop and start output) is provided by canonical mode.
Chapter 4 of Rochkind's Advanced Unix Programming, 2nd Edn covers terminal I/O and gives much of this information — and a whole lot more. Other UNIX programming books (at least, the good ones) will also cover it.