I'm using the readline library in C to create a bash-like prompt within bash. When I tried to make the prompt colorful, with color sequences like these, the coloring works great, but the cursor spacing is messed up. The input is wrapped around too early and the wrap-around is to the same line so it starts overwriting the prompt. I thought I should escape the color-sequences with \[ and \] like
readline("\[\e[1;31m$\e[0m\] ")
But that prints the square brackets, and if I escape the backslashes it prints those too. How do I escape the color codes so the cursor still works?
The way to tell readline that a character sequence in a prompt string doesn't actually move the cursor when output to the screen is to surround it with the markers RL_PROMPT_START_IGNORE (currently, this is the character literal '\001' in readline's C header file) and RL_PROMPT_END_IGNORE (currently '\002').
And as #Joachim and #Alter said, use '\033' instead of '\e' for portability.
I found this question when looking to refine the GNU readline prompt in a bash script. Like readline in C code, \[ and \] aren't special but \001 and \002 will work when given literally via the special treatment bash affords quoted words of the form $'string'. I've been here before (and left unsatisfied due to not knowing to combine it with $'…'), so I figured I'd leave my explanation here now that I have a solution.
Using the data provided here, I was able to conclude this result:
C1=$'\001\033[1;34m\002' # blue - from \e[1;34m
C0=$'\001\033[0;0m\002' # reset - from \e[0;0m
while read -p "${C1}myshell>$C0 " -e command; do
echo "you said: $command"
done
This gives a blue prompt that says myshell> and has a trailing space, without colors for the actual command. Neither hitting Up nor entering a command that wraps to the next line will be confused by the non-printing characters.
As explained in the accepted answer, \001 (Start of Heading) and \002 (Start of Text) are the RL_PROMPT_START_IGNORE and RL_PROMPT_END_IGNORE markers, which tell bash and readline not to count anything between them for the purpose of painting the terminal. (Also found here: \033 is more reliable than \e and since I'm now using octal codes anyway, I might as well use one more.)
There seems to be quite the dearth of documentation on this; the best I could find was in perl's documentation for Term::ReadLine::Gnu, which says:
PROMPT may include some escape sequences. Use RL_PROMPT_START_IGNORE to begin a sequence of non-printing characters, and RL_PROMPT_END_IGNORE to end the sequence.
What is the meaning of the following control characters:
Carriage return
Line feed
Form feed
Carriage return means to return to the beginning of the current line without advancing downward. The name comes from a printer's carriage, as monitors were rare when the name was coined. This is commonly escaped as "\r", abbreviated CR, and has ASCII value 13 or 0xD.
Linefeed means to advance downward to the next line; however, it has been repurposed and renamed. Used as "newline", it terminates lines (commonly confused with separating lines). This is commonly escaped as "\n", abbreviated LF or NL, and has ASCII value 10 or 0xA. CRLF (but not CRNL) is used for the pair "\r\n".
Form feed means advance downward to the next "page". It was commonly used as page separators, but now is also used as section separators. Text editors can use this character when you "insert a page break". This is commonly escaped as "\f", abbreviated FF, and has ASCII value 12 or 0xC.
As control characters, they may be interpreted in various ways.
The most important interpretation is how these characters delimit lines. Lines end with NL on Unix (including OS X), CRLF on Windows, and CR on older Macs. Note the shift in meaning from LF to NL, for the exact same character, gives the differences between Windows and Unix, which is also why many Windows programs use CRLF to separate instead of terminate lines. Many text editors can read files in any of these three formats and convert between them, but not all utilities can.
Form feed is much less commonly used. As page separator, it can only come between lines or at the start or end of the file.
\r is carriage return and moves the cursor back like if i will do-
printf("stackoverflow\rnine")
ninekoverflow
means it has shifted the cursor to the beginning of "stackoverflow" and overwrites the starting four characters since "nine" is four character long.
\n is new line character which changes the line and takes the cursor to the beginning of a new line like-
printf("stackoverflow\nnine")
stackoverflow
nine
\f is form feed, its use has become obsolete but it is used for giving indentation like
printf("stackoverflow\fnine")
stackoverflow
nine
if i will write like-
printf("stackoverflow\fnine\fgreat")
stackoverflow
nine
great
In short:
Carriage return (\r or 0xD): To take control at starting on the same line.
Line feed (\n or 0xA): To Take control at starting on the next line.
Form feed (\f or 0xC): To take control at starting on the next page.
More details and more control characters can be found on the following page: Control character
Have a look at Wikipedia:
Systems based on ASCII or a compatible character set use either LF (Line feed, '\n', 0x0A, 10 in decimal) or CR (Carriage return, '\r', 0x0D, 13 in decimal) individually, or CR followed by LF (CR+LF, 0x0D 0x0A). These characters are based on printer commands: The line feed indicated that one line of paper should feed out of the printer, and a carriage return indicated that the printer carriage should return to the beginning of the current line.
\f is used for page break.
You cannot see any effect in the console. But when you use this character constant in your file then you can see the difference.
Other example is that if you can redirect your output to a file then you don't have to write a file or use file handling.
For ex:
Write this code in c++
void main()
{
clrscr();
cout<<"helloooooo" ;
cout<<"\f";
cout<<"hiiiii" ;
}
and when you compile this it generate an exe(for ex. abc.exe)
then you can redirect your output to a file using this:
abc > xyz.doc
then open the file xyz.doc you can see the actual page break between hellooo and hiiii....
On old paper-printer terminals, advancing to the next line involved two actions: moving the print head back to the beginning of the horizontal scan range (carriage return) and advancing the roll of paper being printed on (line feed).
Since we no longer use paper-printer terminals, those actions aren't really relevant anymore, but the characters used to signal them have stuck around in various incarnations.
Apart from above information, there is still an interesting history of LF (\n) and CR (\r).
[Original author : 阮一峰 Source : http://www.ruanyifeng.com/blog/2006/04/post_213.html]
Before computer came out, there was a type of teleprinter called Teletype Model 33. It can print 10 characters each second. But there is one problem with this, after finishing printing each line, it will take 0.2 second to move to next line, which is time of printing 2 characters. If a new characters is transferred during this 0.2 second, then this new character will be lost.
So scientists found a way to solve this problem, they add two ending characters after each line, one is 'Carriage return', which is to tell the printer to bring the print head to the left.; the other one is 'Line feed', it tells the printer to move the paper up 1 line.
Later, computer became popular, these two concepts are used on computers. At that time, the storage device was very expensive, so some scientists said that it was expensive to add two characters at the end of each line, one is enough, so there are some arguments about which one to use.
In UNIX/Mac and Linux, '\n' is put at the end of each line, in Windows, '\r\n' is put at the end of each line. The consequence of this use is that files in UNIX/Mac will be displayed in one line if opened in Windows. While file in Windows will have one ^M at the end of each line if opened in UNIX or Mac.
Consider an IBM 1403 impact printer. CR moved the print head to the start of the line, but did NOT advance the paper. This allowed for "overprinting", placing multiple lines of output on one line. Things like underlining were achieved this way, as was BOLD print. LF advanced the paper one line. If there was no CR, the next line would print as a staggered-step because LF didn't move the print head. FF advanced the paper to the next page. It typically also moved the print head to the start of the first line on the new page, but you might need CR for that. To be sure, most programmers coded CRFF instead of CRLF at the end of the last line on a page because an extra CR created by FF wouldn't matter.
As a supplement,
1, Carriage return: It's a printer terminology meaning changing the print location to the beginning of current line. In computer world, it means return to the beginning of current line in most cases but stands for new line rarely.
2, Line feed: It's a printer terminology meaning advancing the paper one line. So Carriage return and Line feed are used together to start to print at the beginning of a new line. In computer world, it generally has the same meaning as newline.
3, Form feed: It's a printer terminology, I like the explanation in this thread.
If you were programming for a 1980s-style printer, it would eject the
paper and start a new page. You are virtually certain to never need
it.
http://en.wikipedia.org/wiki/Form_feed
It's almost obsolete and you can refer to Escape sequence \f - form feed - what exactly is it? for detailed explanation.
Note, we can use CR or LF or CRLF to stand for newline in some platforms but newline can't be stood by them in some other platforms. Refer to wiki Newline for details.
LF: Multics, Unix and Unix-like systems (Linux, OS X, FreeBSD, AIX,
Xenix, etc.), BeOS, Amiga, RISC OS, and others
CR: Commodore 8-bit machines, Acorn BBC, ZX Spectrum, TRS-80, Apple
II family, Oberon, the classic Mac OS up to version 9, MIT Lisp
Machine and OS-9
RS: QNX pre-POSIX implementation
0x9B: Atari 8-bit machines using ATASCII variant of ASCII (155 in
decimal)
CR+LF: Microsoft Windows, DOS (MS-DOS, PC DOS, etc.), DEC TOPS-10,
RT-11, CP/M, MP/M, Atari TOS, OS/2, Symbian OS, Palm OS, Amstrad CPC,
and most other early non-Unix and non-IBM OSes
LF+CR: Acorn BBC and RISC OS spooled text output.
"\n" is the linefeed character. It means end the present line and go to a new line for anyone who is reading it.
Carriage return and line feed are also references to typewriters, in that the with a small push on the handle on the left side of the carriage (the place where the paper goes), the paper would rotate a small amount around the cylinder, advancing the document one line. If you had finished typing one line, and wanted to continue on to the next, you pushed harder, both advancing a line and sliding the carriage all the way to the right, then resuming typing left to right again as the carriage traveled with each keystroke. Needless to say, word-wrap was the default setting for all word processing of the era. P:D
Those are non-printing characters, relating to the concept of "new line". \n is linefeed. \r is carriage return. On different platforms they have different meanings, relative to a valid new line. In windows, a new line is \r\n. In linux, \n. In mac, \r.
In practice, you put them in any string, and it will have effect on the print-out of the string.
when I was an apprentice in the Royal Signals many (50) years ago, teletypes and typewriters had "Carriage" with the printing head on them. When you pressed RETURN the Carriage would fly to the left. Hence Carriage Return (CR). You could just return the Carriage, but on mechanical typewriters, you'd use the Lever (much like a tremolo lever on an electric guitar) which would also do the Line Feed. Your next question is why would you not want the line feed? heh heh well in those days to delete characters we'd do a CR then use a Tip-ex-like paper in between the hammerheads and paper and type the same keys to over-write with white ink. Some fancy typewriters had a key you could press. So there you go.
Some applications running in terminal can erase their outputs. e.g.
when it tells you to wait, it will show a sequence of dots alternating between different lengths.
How is erasing output in terminal implemented in C? Is it done by reverse line feed?
Can a program only erase the previous characters in the current line, not the characters in the previous line in stdout?
Thanks.
It depends on the terminal.
The COMSPEC shell on Windows (often called the DOS prompt or command.com) exposes an API in C to control the cursor. I haven't done any Windows programming so I can't tell you much about it.
Most other terminals (especially on unixen) emulate protocols that resemble the VT100 serial terminal (the VT100 terminal was a physical device, a monitor and keyboard, that you attached to a modem or serial port to communicate with a server).
On VT100 terminals, carriage return and line feed are separate commands, both one byte. The carriage return command sets the cursor to the beginning of the line. The line feed command moves the cursor down a line (but doesn't bring the cursor to the beginning of the line by itself). Most shells on unixen automatically insert a carriage return after a line feed but almost none inserts a line feed after a carriage return.
With this knowledge, the simplest implementation is to simply output a carriage return and reprint the entire line:
printf("\rprogress: %d percent ", x);
Note the extra spaces at the end of the line. Printing "\r" doesn't erase the line so reprinting over the old line may end up leaving some of the old string on screen. The extra spaces is used to try and erase the remainder of the old line.
If you googled "VT100 escape secquence", you'll find commands that will allow you to do things like erase the current line, change color of text, goto a specific row/column on screen etc. The most popular use of VT100 sequences is to output coloured text. But you can also do other things with them.
The next simplest implementation is to cleanly delete the line and reprint it:
printf("\033[2K\rprogress: %d percent", x);
The \033[2K is the escape sequence to delete the current line (ESC[2K).
From here you can get more creative if you want. You can use the cursor save/restore command with the erase until end of line command to only erase the part you want to update (instead of the entire line). You can use the goto commands to put the cursor in a specific location on screen to update text there etc.
It should be noted that the more advanced stuff such as VT102 sequences or some of the full ANSI escape sequences are generally not portable accross terminals (by terminals I don't mean the shell, I mean the terminals: rxvt, xterm, linux terminal, hyperterminal(on windows) etc).
If you want portability (and/or sane API) you should use the curses or ncurses libraries.
If you wanted to know how it's done, then that's how it's done. It's just printing specially formatted strings to screen (except for the COMSPEC shell). Kind of like HTML but old-school.
I expect this simple line of code
printf("foo\b\tbar\n");
to replace "o" with "\t" and to produce the following output
fo bar
(assuming that tab stop occurs every 8 characters).
On the contrary I get
foo bar
It seems that my shell interprets \b as "move the cursors one position back" and \t as "move cursor to the next tab stop". Is this behaviour specific to the shell in which I'm running the code? Should I expect different behaviour on different systems?
No, that's more or less what they're meant to do.
In C (and many other languages), you can insert hard-to-see/type characters using \ notation:
\a is alert/bell
\b is backspace/rubout
\n is newline
\r is carriage return (return to left margin)
\t is tab
You can also specify the octal value of any character using \0nnn, or the hexadecimal value of any character with \xnn.
EG: the ASCII value of _ is octal 137, hex 5f, so it can also be typed \0137 or \x5f, if your keyboard didn't have a _ key or something. This is more useful for control characters like NUL (\0) and ESC (\033)
As someone posted (then deleted their answer before I could +1 it), there are also some less-frequently-used ones:
\f is a form feed/new page (eject page from printer)
\v is a vertical tab (move down one line, on the same column)
On screens, \f usually works the same as \v, but on some printers/teletypes, it will
go all the way to the next form/sheet of paper.
Backspace and tab both move the cursor position. Neither is truly a 'printable' character.
Your code says:
print "foo"
move the cursor back one space
move the cursor forward to the next tabstop
output "bar".
To get the output you expect, you need printf("foo\b \tbar"). Note the extra 'space'. That says:
output "foo"
move the cursor back one space
output a ' ' (this replaces the second 'o').
move the cursor forward to the next tabstop
output "bar".
Most of the time it is inappropriate to use tabs and backspace for formatting your program output. Learn to use printf() formatting specifiers. Rendering of tabs can vary drastically depending on how the output is viewed.
This little script shows one way to alter your terminal's tab rendering. Tested on Ubuntu + gnome-terminal:
#!/bin/bash
tabs -8
echo -e "\tnormal tabstop"
for x in `seq 2 10`; do
tabs $x
echo -e "\ttabstop=$x"
done
tabs -8
echo -e "\tnormal tabstop"
Also see man setterm and regtabs.
And if you redirect your output or just write to a file, tabs will quite commonly be displayed as fewer than the standard 8 chars, especially in "programming" editors and IDEs.
So in otherwords:
printf("%-8s%s", "foo", "bar"); /* this will ALWAYS output "foo bar" */
printf("foo\tbar"); /* who knows how this will be rendered */
IMHO, tabs in general are rarely appropriate for anything. An exception might be generating output for a program that requires tab-separated-value input files (similar to comma separated value).
Backspace '\b' is a different story... it should never be used to create a text file since it will just make a text editor spit out garbage. But it does have many applications in writing interactive command line programs that cannot be accomplished with format strings alone. If you find yourself needing it a lot, check out "ncurses", which gives you much better control over where your output goes on the terminal screen. And typically, since it's 2011 and not 1995, a GUI is usually easier to deal with for highly interactive programs. But again, there are exceptions. Like writing a telnet server or console for a new scripting language.
The C standard (actually C99, I'm not up to date) says:
Alphabetic escape sequences representing nongraphic characters in the execution character set are intended to produce actions on display devices as follows:
\b (backspace) Moves the active position to the previous position on the current line. [...]
\t (horizontal tab) Moves the active position to the next horizontal tabulation position on the current line. [...]
Both just move the active position, neither are supposed to write any character on or over another character. To overwrite with a space you could try: puts("foo\b \tbar"); but note that on some display devices - say a daisy wheel printer - the o will show the transparent space.
This behaviour is terminal-specific and specified by the terminal emulator you use (e.g. xterm) and the semantics of terminal that it provides. The terminal behaviour has been very stable for the last 20 years, and you can reasonably rely on the semantics of \b.
\t is the tab character, and is doing exactly what you're anticipating based on the action of \b - it goes to the next tab stop, then gets decremented, and then goes to the next tab stop (which is in this case the same tab stop, because of the \b.
I'm writing a filter (in a pipe destined for a terminal output) that sometimes needs to "overwrite" a line that has just occurred. It works by passing stdin to stdout character-by-character until a \n is reached, and then invoking special behaviour. My problem regards how to return to the beginning of the line.
The first thing I thought of was using a \r or the ANSI sequence \033[1G. However, if the line was long enough to have wrapped on the terminal (and hence caused it to scroll), these will only move the cursor back to the current physical line.
My second idea was to track the length of the line (number of characters passed since previous \n), and then echo \b that many times. However, that goes wrong if the line contained control characters or escape sequences (and possibly Unicode?).
Short of searching for all special sequences and using this to adjust my character count, is there a simple way to achieve this?
Even if there were a "magic sequence" that when written to a console would reliably erase the last written line, you would STILL get the line and the sequence on the output (though hidden on a console). Think what would happen if somebody wrote the output to a file, or passed it down the pipe to other filters? Would they know how to handle such input? And don't tell me you rule out the possibility of writing somewhere else than directly to a console. Sooner or later, somebody WILL want to redirect the output - maybe even you!
The Right Way to do this is to buffer each line in memory as it is processed, and then decide whether to output it or not. There's really no way around this.
$ cat >test.sh <<'EOF'
> #!/bin/sh
> tput sc
> echo 'Here is a really long multi-line string: .............................................................................................'
> tput rc
> echo 'I went back and overwrote some stuff!!!!'
> echo
> EOF
$ sh test.sh
I went back and overwrote some stuff!!!! .......................................
......................................................
Look for the save_cursor and restore_cursor string capabilities in the terminfo database.
You can query terminal dimensions with a simple ioctl:
#include <sys/types.h>
#include <sys/ioctl.h>
// ...
struct winsize ws;
ioctl(1, TIOCGWINSZ, &ws);
// ws.ws_col, ws.ws_row should now contain terminal dimensions
This way you can prevent printing anything beyond the end of line and simply use the \r method.