Why can't we print ASCII values from 0 to 31? - c

#include <stdio.h>
int main(void)
{
    for (int i = 0; i <= 31; i++)
        printf("%c", i);
    return 0;
}
When we run this code, nothing prints.
What is the reason for this?

C is printing them, but perhaps your terminal is not displaying them. This distinction is important because the terminal is responsible for interpreting the output of your program, printing letters, moving the cursor around, changing colors and such.
By historical convention, the first 32 characters of the ASCII table are "control characters". Some, like tab and line feed, affect layout; some, like backspace, move the cursor; others, like BEL, can make your terminal beep.
Different terminals may display these differently, or not at all.
It's worth noting that ASCII pre-dates modern "glass" terminals and that these codes were used to move the print-head around on the page. Early machines used teletypes to communicate with them: a line-feed would crank down the paper one line, and a carriage return would move the print-head back to the start of the line, much like the physical carriage return on a typewriter which would move the "carriage" back to the first column.
These were pretty elaborate electromechanical contraptions that didn't have any modern circuitry in them, yet they could still process ASCII data, at least for those using ASCII, as there are other character sets like EBCDIC that co-existed with ASCII.
These characters were never intended to be printed, so they don't have a standard visual representation in ASCII.
With "extended ASCII", as used in DOS, there are symbols defined for them because it seemed like a waste otherwise. These symbols don't have control-code meanings; typically you write them directly to the console character buffer in order to see them.

You can, it's just that most of them are non-printable control characters that most terminals ignore. If you redirect stdout to a file, the file will contain those characters; it's just the terminal that doesn't know what to do with them. Some of them are handled by terminals (e.g. the line feed and backspace characters), but others are nonsensical on a screen (e.g. end of transmission, data link escape) and get ignored, or replaced with a different character for display (often a space or a question mark or the like).

Related

What's the difference between putch() and putchar()?

Okay so, I'm pretty new to C.
I've been trying to figure out what exactly is the difference between putch() and putchar()?
I tried googling my answers but all I got was the same copy-pasted-like message that said:
putchar(): This function is used to print one character on the screen, and this may be any character from C character set (i.e it may be printable or non printable characters).
putch(): The putch() function is used to display all alphanumeric characters through the standard output device like monitor. this function display single character at a time.
As English isn't my first language I'm kinda lost. Are there non printable characters in C? If so, what are they? And why can't putch produce the same results?
I've tried googling the C character set and all of the alphanumeric characters there are, but as much as my testing went, there wasn't really anything that one function could print and the other couldn't.
Anyways, I'm kind of lost.
Could anyone help me out? thanks!
TLDR;
what can putchar() do that putch() can't? (or the opposite or something idk)
dunno, hoped there would be a visible difference between the two but can't seem to find it.
putchar() is a standard function, possibly implemented as a macro, defined in <stdio.h> to output a single byte to the standard output stream (stdout).
putch() is a non standard function available on some legacy systems, originally implemented on MS/DOS as a way to output characters to the screen directly, bypassing the standard streams buffering scheme. This function is mostly obsolete, don't use it.
Output via stdout to a terminal is line buffered by default on most systems, so as long as your output ends with a newline, it will appear on the screen without further delay. If you need output to be flushed in the absence of a newline, use fflush(stdout) to force the stream buffered contents to be written to the terminal.
putch replaces the newline character (\n)
putchar is a function in the C programming language that writes a single character to the standard output.

Can backspace escape cancel a new-line escape?

I'm working with ubuntu.
Code:
printf("Hello\n\b world");
I get on terminal:
Hello
world
Why does backspace not cancel the \n?
Is there a hierarchy in chars?
How can I delete special chars?
Your question goes beyond the scope of the C language: printf("Hello\n\b world"); outputs the bytes from the format string, possibly translated according to the text mode handling of newlines:
on unix systems, the bytes are output to the system handle unmodified.
on Microsoft legacy systems, the newline is converted to CR LF and the other bytes transmitted unmodified.
If the standard output is directed to a file, the file will contain the translation of the newline and a backspace (0x08 on most systems).
If the standard output goes to a terminal, the handling of the backspace special character is outside the program's control: the terminal (hardware, virtual, local or remote...) will perform its task as programmed and configured... Most terminals move the cursor left one position on whatever display they control, some erase the character at that position. If the cursor is already at column 1, it is again system dependent whether backspace moves the cursor back to the end of the previous line, whatever that means. Many systems don't do that and keep the cursor at column 1. This seems consistent with the behavior you observe.
This is what the C standard says (in C 2018 5.2.2 2) about the new line character:
Moves the active position to the initial position of the next line.
and backspace:
Moves the active position to the previous position on the current line. If the active position is at the initial position of a line, the behavior of the display device is unspecified.
Note that the backspace character is not specified to erase a previous character. It is specified to cause a certain action on a display device.
Recall that C was developed in an era when teletypes and other physical printing devices were in common use. Many of these devices could only push the paper upward. Once a new line character caused the paper to be pushed upward, there was no way to move it downward again.
Additionally, some early video displays, or the software driving them, emulated physical printing and did not support going back a line, at least in some of their modes of operation.
On displays where one could move the cursor freely, it is not clear what a backspace from the beginning of a line should do. Consider a display which has 80 columns, numbered from 1 to 80, and the last line printed contained 40 characters, followed by a new line. When we backspace, we move the cursor back to that line, but which column do we move it to? Column 80, the last one of the display? Or column 40, the last one where something was printed? Different devices might handle this differently. Note that the latter choice requires the device to remember the length of each line, an added burden on early computing machinery. (My high school’s cheap display terminals did not have enough memory to remember all the text in a 24×80 display. I think it was only 1024 bytes, enough for 12.8 lines of 80 characters. If you wrote complete lines of text, it would scroll earlier lines off the display, keeping only the last 12.)
Because of these variations in behavior, the C standard did not specify the details of backspacing from the start of a line.
You ask about a “backspace escape” canceling a “new-line escape.” However, the escape sequences are irrelevant here; they are in a different layer of representation than the operations of the characters:
Inside a string literal, \b and \n are escape sequences. As the compiler translates the program, it replaces these with a backspace character and a new line character. Then they are no longer escape sequences; they are simply characters in a string.
When you write the characters with printf, they are transmitted as characters in a stream.
When the characters are sent to a display device (because that is what the stream is connected to), they produce the actions in the 5.2.2 2 text cited above.
Those escape sequences \b and \n represent control characters. A control character is a special character that, well, controls the behavior of the output device in some special way. When you say
printf("A");
it prints the (ordinary) character A to the screen. But when you say
printf("\n");
it doesn't print anything, instead it moves the cursor down to the beginning of the next line.
Now, the meaning of \b is not "cancel the character to the left". The control character \b does not "cancel" anything. What it does is just move the cursor one character to the left, if it can. But if the cursor is already at the left edge, it probably can't.
Once upon a time, and especially when the output was going to a printer that actually printed on paper, it was common to do things like
printf("this is u\b_n\b_d\b_e\b_r\b_l\b_i\b_n\b_e\b_d\b_\n");
or
printf("this is b\bbo\bol\bld\bd\n");
to print underlined or bold words by overprinting. These examples obviously rely on the move-one-to-the-left behavior of \b. These examples prove that the behavior of \b is not anything like "canceling"!
It sounds like you think \b might somehow affect the string it's part of.
It sounds like you think \b might somehow be processed by your C compiler, or by the C library.
It sounds like you think that the string "abc\bdef" might get converted to "abdef".
But none of these things is true. The backspace character \b is interpreted by your screen or your printer, or whatever output device your program is "printing" to. The interpretation of control characters like \b is mostly up to your output device. It is mostly not a property of the C programming language.

Confused with interesting printf() statement

By reading this code, I stumbled upon the following printf() statement:
// reset, hide cursor and clear screen
printf("\e[0m\e[?25l\e[2J");
I must admit that I am not a fully qualified C hacker and do not fully understand this. I tweaked around, removing the arguments, and I understand what it does (well, the comment actually says it all), but I have no idea how it's done. Also, this is something kind of hard to google for.
How does this printf() call work?
This doesn't really have anything to do with printf. The C11 standard lists escape sequences in §5.2.2, and the list consists of \a, \b, \f, \n, \r, \t and \v. As an extension, GCC considers \e to be an escape sequence which stands for the ASCII character Esc (\E may work as well, or your compiler may support neither of them. Consult the documentation for your compiler). What follows are non-portable control sequences. They are not guaranteed to work the same in all terminals, or even work at all. The best way to know is to consult the documentation for your system.
§6.4.4.4 also describes octal escape sequences. For example, \033, where 033 is 27 in decimal, and therefore the escape character in ASCII. Similarly, you can use \x1b, which is a hexadecimal escape sequence specifying the same character.
If we inspect the output of the program with od -c, it shows 033.
(✿´‿`) ~/test> ./a.out | od -c
0000000 033 [ 0 m 033 [ ? 2 5 l 033 [ 2 J
0000016
The ANSI escape sequences are interpreted by terminal emulators. C will convert the octal/hexadecimal escape sequences to the ASCII Esc character. Your compiler, as an extension, might also convert \e or \E. As requested, a brief explanation of what the control sequences are doing:
[0m: resets all the SGR attributes
[?25l: hides the cursor
[2J: from Wikipedia:
Clears part of the screen. If n is 0 (or missing), clear from cursor
to end of screen. If n is 1, clear from cursor to beginning of the
screen. If n is 2, clear entire screen ...
The printf() call is simply outputting a specific series of byte values. The "magic" is that those values are special in the terminal.
A special series of bytes starting with the ASCII "escape" character is called an "escape sequence". These were invented for serial data terminals, where the only means of communication with the terminal was by sending byte values through the serial connection. Ordinary characters are simply displayed on the terminal, but it was desirable to have a way to move the cursor, clear the screen, etc. and most terminals used escape sequences for this.
http://en.wikipedia.org/wiki/Escape_sequence
There was one particularly popular terminal called the "VT100", and most terminal emulators today operate using VT100 escape sequences.
Even today, escape sequences are useful. You can write a simple C program that will work on the terminal emulators in Linux, Mac, Windows, mobile devices, basically everywhere. When you need to do something simple like clear the screen, just outputting the proper escape sequence is the easiest way.

Is \n multi-character in C?

I read that \n consists of CR & LF. Each has their own ASCII codes.
So is the \n in C represented by a single character or is it multi-character?
Edit: Kindly specify your answer, rather than simply saying "yes, it is" or "no, it isn't"
In a C program, it's a single character, '\n', representing end of line. However, some operating systems (most notably Microsoft Windows) use two characters to represent end of line in text files, and this is likely where the confusion comes from.
It's the responsibility of the C I/O functions to do the conversions between the C representation of '\n' and whatever the OS uses.
In C programs, simply use '\n'. It is guaranteed to be correct. When looking at text files with some sort of editor, you might see two characters. When a text file is transferred from Windows to some Unix-based system, you might get "^M" showing up at the end of each line, which is annoying, but has nothing to do with C.
Generally: '\n' is a single character, which represents a newline. '\r' is a single character, which represents a carriage-return. They are their own independent ASCII characters.
Issues arise because in the actual file representation, UNIX-based systems tend to use '\n' alone to represent what you think of when you hit "enter" or "return" on the keyboard, whereas Windows uses a '\r' followed directly by a '\n'.
In a file:
"This is my UNIX file\nwhich spans two lines"
"This is my Windows file\r\nwhich spans two lines"
Of course, like all binary data, these characters are all about interpretation, and that interpretation depends on the application using the data. Stick to '\n' when you are making C-strings, unless you want a literal carriage-return, because as people have pointed out in the comments, the OS representation doesn't concern you. IO libraries, including C's, are supposed to handle this themselves and abstract it away from you.
For your curiosity, in decimal, '\n' in ASCII is 10, '\r' is 13, but note that this is the ASCII standard, not a C standard.
It depends:
'\n' is a single character (ASCII LF)
"\n" is a '\n' character followed by a 0 terminator
some I/O operations transform a '\n' into '\r\n' on some systems (CR-LF).
When you print the \n to a file, using the windows C stdio libraries, the library interprets that as a logical new-line, not the literal character 0x0A. The output to the file will be the windows version of a new-line: 0x0D0A (\r\n).
Writing
Sample code:
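#include <stdio.h>
int main(void) {
    FILE *f = fopen("foo.txt", "w");  /* text mode: '\n' may be translated */
    if (f == NULL)
        return 1;
    fprintf(f, "foo\nbar");
    fclose(f);  /* flush and close so the bytes reach the file */
    return 0;
}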
A quick cl /EHsc foo.c later and you get
0x666F6F 0x0D0A 0x626172 (separated for convenience)
in foo.txt under a hex editor.
It's important to note that this translation DOES NOT occur if you are writing to a file in 'binary mode'.
Reading
If you are reading the file back in using the same tools, also on windows, the "windows EOL" will be interpreted properly if you try to match up against \n.
When reading it back
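#include <stdio.h>
int main(void) {
    FILE *f = fopen("foo.txt", "r");  /* text mode: "\r\n" is read back as '\n' */
    if (f == NULL)
        return 1;
    char c;
    while (fscanf(f, "%c", &c) == 1)
        printf("%x-", (unsigned char)c);
    fclose(f);
    return 0;
}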
You get
66-6f-6f-a-62-61-72-
Therefore, the only time this should be relevant to you is if you are
Moving files back and forth between mac/unix and windows. Unix needs no real explanation here, since \n directly translates to 0x0A on those platforms. (pre-OSX \n was 0x0D on mac iirc)
Putting text in binary files, only do this carefully please
Trying to figure out why your binary data is being messed up when you opened the file "w", instead of "wb"
Estimating something important based on the size of the file, on windows you'll have an extra byte per newline.
\n is a new-line -- it's a logical representation of whatever separates one line from another in a text file.
A given platform will have some physical representation of that logical separation between lines. On Unix and most similar systems, the new-line is represented by a line-feed (LF) character (and since Unix was/is so closely associated with C, on Unix the LF is often just called a new-line). On classic Mac OS (before OS X), it was typically represented by a carriage-return (CR). On a fair number of other systems, most prominently Windows, it's represented by a carriage return/line feed pair -- normally in that order, though once in a while you see something use LF followed by CR (as I recall, Clarion used to do that).
In theory, a new-line doesn't need to correspond to any characters in the stream at all though. For example, a system could have text files that were stored as a length followed by the appropriate number of characters. In such a case, the run-time library would need to carry out a slightly more extensive translation between internal and external representations of text files than is now common, but such is life.
According to the C99 Standard (section 5.2.2),
\n "moves the active position [where the next character from fputc would appear] to the initial position on the next line".
Also
[\n] shall produce a unique implementation-defined value
which can be stored in a single char object. The external representations in a text file
need not be identical to the internal representations and are outside the scope of [the C99 Standard]
Most C implementations choose to define \n as ASCII line feed (0x0A) for historical reasons. However, on many computer operating systems, the sequence for moving the active position to the beginning of the next line requires two characters usually 0x0D, 0x0A. So, when writing to a text file, the C implementation must convert the internal sequence of 0x0A to the external one of 0x0D, 0x0A. How this is done is outside of the scope of the C standard, but usually, the file IO library will perform the conversion on any file opened in text mode.
Your question is about text files.
A text file is a sequence of lines.
A line is a sequence of characters ending in (and including) a line break.
A line break is represented differently by different operating systems.
On Unix/Linux/Mac they are usually represented by a single LINEFEED
On Windows they are usually represented by the pair CARRIAGE RETURN + LINEFEED
On old Macs they were usually represented by a single CARRIAGE RETURN
On other systems (AS/400 ??) there may not even be a specific character that represents a line break ...
Anyway, the library code in C is responsible for translating the system's line break to '\n' when reading text files and doing the reverse operation when writing text files.
So, no matter what the representation is on any given system, when you read a text file in C, lines will be ended by a '\n'.
Note: The '\n' is not necessarily 0x0a in all systems.
Yes it is.
\n is a newline. Hex code is 0x0A.
\r is a carriage return. Hex code is 0x0D
It is a single character. It represents Newline (but is not the only representation - Wikipedia).
EDIT: The question was changed while I was typing the answer.

Carriage return required when printing to the console in Windows?

It seems like just putting a linefeed is good enough, but I know it is supposed to be carriage return + line feed. Does anything horrible happen if you don't put the carriage return and only use line feeds?
This is in ANSI C and not going to be redirected to a file or anything else. Just a normal console app.
The Windows console follows the same line ending convention that is assumed for files, or for that matter for actual, physical terminals. It needs to see both CR and LF to properly move to the next line.
That said, there is a lot of software infrastructure between an ANSI C program and that console. In particular, any standard C library I/O function is going to try to do the right thing, assuming you've allowed it the chance. This is why fopen()'s t and b modifiers for the mode parameter were defined.
With t (the default for most streams, and in particular for stdin and stdout) then any \n printed is converted to a CRLF sequence, and the reverse happens for reads. To turn off that behavior, use the b modifier.
Incidentally, the terminals traditionally hooked to *nix boxes, including the DEC VT100 emulated by XTerm, also need both CR and LF. However, in the *nix world, the conversion from a newline character to a CRLF sequence is handled in the tty device driver, so most programs don't need to know about it, and the t and b modifiers are both ignored. On those platforms, if you need to send and receive characters on a tty without that modification, you need to look up stty(1) or the system calls it depends on.
If your otherwise ANSI C program is avoiding C library I/O to the console (perhaps because you need access to the console's character color and other attributes) then whether you need to send CR or not will depend on which Win32 API calls you are using to send the characters.
If you're in a *nix environment, \n (linefeed) is probably ok. If you're in Windows and aren't redirecting (now), a linefeed is also ok, but if someone at some point redirects, :-(
If you're doing Windows though, there could be issues if the output is redirected to a text file and then another process tries to consume the data.
The console knows what to show, but consumers might not be happy...
If you are using C#, you might try the Environment.NewLine "constant".
http://msdn.microsoft.com/en-us/library/system.environment.newline.aspx
If you're really in vanilla c, you're stuck with \r\n. :-)
It depends on what you're using them for. Some programs will not display newlines properly if you don't put both \r and \n.
If you try to only write \n some programs that consume your text file (or output) may display your text as a single line instead of multiple lines.
There are also some file formats and protocols that will completely be invalid without using both \r and \n.
I haven't tried it in so long that I'm not sure I remember what happens... but doesn't a linefeed by itself move down a line without returning to the left column?
Depending on your compiler, the standard output might be opened in text mode, in which case a single linefeed will be translated to \r\n before being written out.
Edit: I just tried a quick test, and in XP a file without returns displays normally. I still don't know if any compilers insert the returns for you.
In C, files (called "streams") come in two flavors - binary or text.
The meaning of this distinction is left implementation/platform dependent, but on Windows (with common implementations that I've seen) when writing to text streams '\n' is automatically translated to "\r\n", and when reading from text streams "\r\n" is automatically translated to '\n'.
The "console" is actually "standard output", which is a stream opened by default as a text stream. So, in practice on Windows, writing "Hello, world!\n" should be quite sufficient - and portable.

Resources