Confused with interesting printf() statement - c

By reading this code, I stumbled upon the following printf() statement:
// reset, hide cursor and clear screen
printf("\e[0m\e[?25l\e[2J");
I must admit that I am not a fully qualified C hacker and do not fully understand this. I tweaked around, removing the arguments, and I understand what it does (well, the comment actually says it all), but I have no idea how it's done. Also, this is something kind of hard to google for.
How does this printf() call work?

This doesn't really have anything to do with printf. The C11 standard lists the alphabetic escape sequences in §5.2.2: \a, \b, \f, \n, \r, \t and \v. As an extension, GCC treats \e as an escape sequence that stands for the ASCII character Esc. \E may work as well, or your compiler may support neither; consult its documentation. What follows the Esc character are non-portable control sequences. They are not guaranteed to work the same in all terminals, or even to work at all. The best way to know is to consult the documentation for your system.
§6.4.4.4 also describes octal escape sequences. For example, \033, where 033 is 27 in decimal, and therefore the escape character in ASCII. Similarly, you can use \x1b, which is a hexadecimal escape sequence specifying the same character.
If we inspect the output of the program with od -c, it shows 033.
(✿´‿`) ~/test> ./a.out | od -c
0000000 033 [ 0 m 033 [ ? 2 5 l 033 [ 2 J
0000016
The ANSI escape sequences are interpreted by terminal emulators. C will convert the octal/hexadecimal escape sequences to the ASCII Esc character. Your compiler, as an extension, might also convert \e or \E. As requested, a brief explanation of what the control sequences are doing:
[0m: resets all the SGR attributes
[?25l: hides the cursor
[2J: from Wikipedia:
Clears part of the screen. If n is 0 (or missing), clear from cursor
to end of screen. If n is 1, clear from cursor to beginning of the
screen. If n is 2, clear entire screen ...

The printf() call is simply outputting a specific series of byte values. The "magic" is that those values are special in the terminal.
A special series of bytes starting with the ASCII "escape" character is called an "escape sequence". These were invented for serial data terminals, where the only means of communication with the terminal was by sending byte values through the serial connection. Ordinary characters are simply displayed on the terminal, but it was desirable to have a way to move the cursor, clear the screen, etc. and most terminals used escape sequences for this.
http://en.wikipedia.org/wiki/Escape_sequence
There was one particularly popular terminal called the "VT100", and most terminal emulators today operate using VT100 escape sequences.
Even today, escape sequences are useful. You can write a simple C program that will work on nearly all modern terminal emulators on Linux, Mac, mobile devices and so on (older Windows consoles being a notable exception). When you need to do something simple like clear the screen, just outputting the proper escape sequence is the easiest way.

Related

What's the difference between putch() and putchar()?

Okay so, I'm pretty new to C.
I've been trying to figure out what exactly is the difference between putch() and putchar()?
I tried googling my answers but all I got was the same copy-pasted-like message that said:
putchar(): This function is used to print one character on the screen, and this may be any character from C character set (i.e it may be printable or non printable characters).
putch(): The putch() function is used to display all alphanumeric characters through the standard output device like monitor. this function display single character at a time.
As English isn't my first language I'm kinda lost. Are there non printable characters in C? If so, what are they? And why can't putch produce the same results?
I've tried googling the C character set and all of the alphanumeric characters there are, but as much as my testing went, there wasn't really anything that one function could print and the other couldn't.
Anyways, I'm kind of lost.
Could anyone help me out? thanks!
TLDR;
what can putchar() do that putch() can't? (or the opposite or something idk)
dunno, hoped there would be a visible difference between the two but can't seem to find it.
putchar() is a standard function, possibly implemented as a macro, defined in <stdio.h> to output a single byte to the standard output stream (stdout).
putch() is a non-standard function available on some legacy systems, originally implemented on MS-DOS as a way to output characters to the screen directly, bypassing the buffering scheme of the standard streams. This function is mostly obsolete; don't use it.
Output via stdout to a terminal is line buffered by default on most systems, so as long as your output ends with a newline, it will appear on the screen without further delay. If you need output to be flushed in the absence of a newline, use fflush(stdout) to force the stream's buffered contents to be written to the terminal.
putch replaces the newline character (\n)
putchar is a function in the C programming language that writes a single character to the standard output

Why can't we print ASCII values from 0 to 31?

#include <stdio.h>
int main(void)
{
    for (int i = 0; i <= 31; i++)
        printf("%c", i);
    return 0;
}
When we run this code, nothing is printed.
What is the reason for this?
C is printing them, but perhaps your terminal is not displaying them. This distinction is important because the terminal is responsible for interpreting the output of your program, printing letters, moving the cursor around, changing colors and such.
By historical convention the first 32 characters of the ASCII table are considered "control characters". Some, like tab and newline, have a visible effect on the layout; some, like backspace, move the cursor; others, like BEL, can make your terminal beep.
Different terminals may display these differently, or not at all.
It's worth noting that ASCII pre-dates modern "glass" terminals and that these codes were used to move the print head around on the page. Early machines were operated through teletypes: a line feed would advance the paper one line, and a carriage return would move the print head back to the start of the line, much like the physical carriage return on a typewriter, which would move the "carriage" back to the first column.
These were pretty elaborate electromechanical contraptions that didn't have any modern circuitry in them, yet they could still process ASCII data, at least those that used ASCII, as other character sets like EBCDIC co-existed with it.
As these characters were never intended to be printed, they don't have a standard visual representation in ASCII.
With "extended ASCII", as used in DOS, there are symbols defined for them because it seemed like a waste otherwise. These symbols don't have control-code meanings; typically you write them directly to the console character buffer in order to see them.
You can; it's just that most of them are non-printable control characters that most terminals ignore. If you redirect stdout to a file, the file will contain those characters; it's just the terminal that doesn't know what to do with them. Some of them are handled by terminals (e.g. the line feed and backspace characters), but others are meaningless in that context (e.g. end of transmission, data link escape) and get ignored, or replaced with a different character for display (often a space, a question mark or the like).

In C, is \f same as \v in terminals now?

We know \f was used in C as form feed in earlier times, and the escape sequence still stands for form feed. How does it behave in (modern) terminals now? It looks the same as a vertical tab \v (rather than a new line \n, which we would naturally expect).
Is \f the same as \v now when printed on screens?
Why is \f printing a vertical space on screens now? Instead of start of next page, isn't start of next line more natural?
How control characters render on a terminal depends on the terminal emulator, not the programming language you use. So you should search for documentation about your particular system.
For example, a Linux console responds to \v and \f in the same way:
LF (0x0A, ^J), VT (0x0B, ^K) and FF (0x0C, ^L) all give a linefeed, and if LF/NL (new-line mode) is set also a carriage return;
(quoted from man 4 console_codes)
Note that despite the above statement, the character 0x0A (\n) will echo as a CR-LF sequence because of the underlying terminal's default onlcr setting, which causes newlines to be automatically translated to a CR-LF sequence. See man stty.
When line printers tended to use the green and white striped 11" x 14" fanfold paper, the equipment usually set VT to advance to the next lower inch, just as HT would advance to the next multiple of 8 columns. Applications could thus get output aligned to the top of one of those stripes without the extra code of counting lines per page to add individual LF codes manually, and without possibly wasting a lot of paper if only FF was available. This was generalized in ISO 6429 with the VTS code and other sequences. Using a code like this also let the equipment advance the paper faster than a line at a time, as it knew the paper didn't have to be stationary to print a possible graphic in the lines skipped.
C passes them through transparently (using the ASCII codes on nearly all compilers that are not very ancient).
Terminals handle the ASCII control codes. As in the answer by @rici, on consoles they are usually equivalent, but "terminal" is a broad concept. \v was used for printers, e.g. for headers and footers, especially on old printers that were sent characters directly. (Printers are still sent characters today, but in the form of PostScript code, which is also ASCII, and never as the old plain text plus formatting sequences.) \f also makes a lot more sense on a printer: new page.
The \v and \f escape sequences are very seldom used. According to The C Book (Annotated Reference), Table 866.2, \v accounts for 0.31% of all escape sequences, and \f for 0.44%.
BTW, \f (^L) is more often found in C source code as a literal ASCII character than as an escape sequence. Some editors insert it to split regions (e.g. for hiding or showing just some regions). But this too (in my experience) is fading out: editors are now smarter, so they can select regions automatically (functions, declarations, etc., including any documentation comments just before the declaration or definition).
Note: some escape sequences are also used in communication between programs, to mark "end of record" (but not end of communication).

What does printf("\033c" ) mean?

I was looking for a way to "reset" my Unix terminal window after closing my program, and stumbled upon printf("\033c" ); which works perfectly, but I just can't understand it. I went to man console_codes, and since I'm somewhat inexperienced with Unix C programming, it wasn't very helpful.
Could someone explain printf("\033c" );?
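In a C string literal, a backslash followed by one to three octal digits is an octal escape sequence, that is, a number in base 8 (much as integer constants with a leading zero are octal).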
What it does is print the character represented by octal number 33 followed by a 'c'.
In ASCII encoding the octal number 33 is the ESC (escape) character, which is a common prefix for terminal control sequences.
With that knowledge searching for terminal control sequences we can find e.g. this VT100 control sequence reference (VT100 was an old "dumb" terminal, and is emulated by most modern terminal programs). Using the VT100 reference we find <ESC>c in the terminal setup section, where it's documented as
Reset Device <ESC>c
Reset all terminal settings to default.
The ESC character could also be printed using "\x1b" (still assuming ASCII encoding). There is no way to use decimal numbers in constant string literals, only octal and hexadecimal.
However (as noted in the comment by chux) the sequence "\x1bc" will not do the same as "\033c". That's because c is a valid hexadecimal digit and the compiler is greedy when it parses such sequences, so it reads the whole thing as the single value 0x1bc. That value does not fit in a char, so compilers typically reject it or warn that the hex escape is out of range. Splitting the literal, as in "\x1b" "c", avoids the problem, because adjacent string literals are concatenated only after the escape sequences have been processed.
That's an escape sequence used to reset a DEC VT100 (or compatible) terminal. Some terminals (such as Linux console) accept VT100-style escape sequences, even when they are not actually VT100s.
The \033 is the ASCII escape character, which begins these sequences. Most are followed by another special character (this is a rare exception). XTerm Control Sequences lists that, along with others that are not followed by a special character.
In ECMA-48 it is possible to use a different character for the usual case, e.g., [ for the control sequence initiator.
Resetting a real VT100 (in contrast to a terminal emulator) does more than clear the screen, as noted in Debian Bug report logs - #60377 "reset" broken for dumb terminals, but users of terminal emulators tend to assume it is a short way to clear the screen. The standard way would be something like this:
printf("\033[H\033[J");
The ncurses FAQ Why does reset log me out? addresses that issue.
Incidentally, users of terminal emulators also get other issues with the terminal confused. The ncurses FAQ How do I get color with VT100? addresses one of those.
It clears the screen on Linux-type operating systems (Ubuntu, Fedora, etc.).
You can check here on asciitable.com, under octal 33 (decimal 27) you have ESC character.

Carriage return required when printing to the console in Windows?

It seems like just putting a linefeed is good enough, but I know it is supposed to be carriage return + line feed. Does anything horrible happen if you don't put the carriage return and only use line feeds?
This is in ANSI C and not going to be redirected to a file or anything else. Just a normal console app.
The Windows console follows the same line ending convention that is assumed for files, or for that matter for actual, physical terminals. It needs to see both CR and LF to properly move to the next line.
That said, there is a lot of software infrastructure between an ANSI C program and that console. In particular, any standard C library I/O function is going to try to do the right thing, assuming you've allowed it the chance. This is why fopen()'s b modifier for the mode parameter was defined, along with the t modifier that Windows C libraries offer as an extension.
In text mode (the default for most streams, and in particular for stdin and stdout), any \n printed is converted to a CRLF sequence, and the reverse happens for reads. To turn off that behavior, use the b modifier.
Incidentally, the terminals traditionally hooked to *nix boxes, including the DEC VT100 emulated by XTerm, also need both CR and LF. However, in the *nix world, the conversion from a newline character to a CRLF sequence is handled in the tty device driver, so most programs don't need to know about it, and the t and b modifiers are both ignored. On those platforms, if you need to send and receive characters on a tty without that modification, you need to look up stty(1) or the system calls it depends on.
If your otherwise ANSI C program is avoiding C library I/O to the console (perhaps because you need access to the console's character color and other attributes) then whether you need to send CR or not will depend on which Win32 API calls you are using to send the characters.
If you're in a *nix environment, \n (linefeed) is probably OK. If you're on Windows and aren't redirecting (for now), a linefeed is also OK, but if someone at some point redirects, :-(
If you're doing Windows though, there could be issues if the output is redirected to a text file and then another process tries to consume the data.
The console knows what to show, but consumers might not be happy...
If you are using C#, you might try the Environment.NewLine "constant".
http://msdn.microsoft.com/en-us/library/system.environment.newline.aspx
If you're really in vanilla C, you're stuck with \r\n. :-)
It depends on what you're using them for. Some programs will not display newlines properly if you don't put both \r and \n.
If you try to only write \n some programs that consume your text file (or output) may display your text as a single line instead of multiple lines.
There are also some file formats and protocols that will be completely invalid without both \r and \n.
I haven't tried it in so long that I'm not sure I remember what happens... but doesn't a linefeed by itself move down a line without returning to the left column?
Depending on your compiler, the standard output might be opened in text mode, in which case a single linefeed will be translated to \r\n before being written out.
Edit: I just tried a quick test, and in XP a file without returns displays normally. I still don't know if any compilers insert the returns for you.
In C, files (called "streams") come in two flavors - binary or text.
The meaning of this distinction is left implementation/platform dependent, but on Windows (with common implementations that I've seen) when writing to text streams '\n' is automatically translated to "\r\n", and when reading from text streams "\r\n" is automatically translated to '\n'.
The "console" is actually "standard output", which is a stream opened by default as a text stream. So, in practice on Windows, writing "Hello, world!\n" should be quite sufficient - and portable.

Resources