Newline character in C other than \n? [closed] - c

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
Is there a replacement of \n in C? Can I jump to next line without using \n? I came across this question and I cannot seem to figure out a way..
#include <stdio.h>
int main(void)
{
    printf("A");
    printf("B");
    return 0;
}
I want it to print
A
B
And not
AB
But I cannot use \n.

You can use puts("A"); puts("B");.

To output \n without writing \n you could simply use printf("%c",10); or char c=10;write(1,&c,1); or any of a hundred other ways to output ASCII 10.
But if you would like to avoid the newline character completely, the answer varies. It's not C or your code that actually decides to go to a new line, but whatever the device is that displays your output.
E.g. a terminal, or a line printer, or a web browser showing html.
Suppose you are outputting to a browser, the answer would be <br>
If you are outputting to an xterm terminal, \033[1B moves the cursor one line down. Then output a carriage return \r to move to the beginning of the line.
So for example:
printf("test\033[1B\r123");
will output (in an xterm)
test
123

As an alternative to puts(), you can use the octal equivalent of \n: \012.
E.g.
#include <stdio.h>
int main()
{
printf("A\012");
printf("B\012");
return 0;
}

\n means "new line". \r means "carriage return". These characters have their origin in old form-feed printers, which needed both: one to advance the paper to a new line, one to return the carriage with the print head to the start of the line.
Different operating systems use different combinations of the two to indicate a new line in text:
Unix (and Linux, Mac OSX, etc) uses only \n
Windows uses \r\n
Mac OS up to version 9 uses only \r
There is no canonical platform-independent way in the C standard libraries to represent the "System"'s newline/carriage return character. However, many other languages and/or libraries have implemented this, e.g:
Java: System.getProperty("line.separator")
.NET: System.Environment.NewLine
However, if you have an output stream open on a text file, the standard C libraries should translate \n to whatever is necessary to indicate a new line on the current system.

I'd simply use:
printf("A%cB",10);

In C the escape sequence \n represents a character that belongs to the basic execution character set.
In particular, one can be sure that a new-line character exists when our C programs are running.
More important: \n is mapped to 1 and only 1 character, whose code is a positive integer number in the range of the type char.
However, the behaviour of the display devices can produce other results.
For example, when we send a character \n to a text file under Windows, it is replaced by the sequence of two characters \x0D\x0A (CR+LF, that is: Carriage Return + Line Feed).
The standard C99 or C11 says:
(5.2.2) \n (new line) Moves the active position to the initial position of the next line.
This means that (in every system that prints several lines on its standard display device) the "effect" of sending a character \n is "advance to the next line" and "go to the beginning of that line". It is the sum of the "line feed" and "carriage return" operations.
In DOS/Windows this is equivalent to sending the sequence of these two characters: '\xD', '\xA'.
In Linux/Unix this is equivalent to sending the single character '\xA'.
More details here: http://en.wikipedia.org/wiki/Newline
Summarizing, we have to distinguish between the character \n itself and the semantics attached to \n when it is treated as a control character in C.
You can "produce" a new line just by sending the character '\n' with putc() or printf().
Using \n is the best way to do that.
Another very good alternative, pointed out in another answer, is to use the function puts(str).
This function appends a new-line to the end of the string str.
I think it is not at all a good idea to try these other alternatives:
printf("%c", 10); // 10 == 0x0A
printf("%c", 13); // 13 == 0x0D
printf("\x0A"); // LF
printf("\x0D"); // CR
You are not solving the problem properly, because your program becomes system dependent.
Valid for Windows, Linux or what?
Worse: in theory, standard C does not even ensure that the character codes used on the system where your program runs correspond to the ASCII/Unicode code numbers. This applies to the control characters, too.
So, you cannot be sure that 10 (0x0A) means "line feed" and 13 (0x0D) means "carriage return".
To be sure of that, it is necessary to check the existence of the following macro:
__STDC_ISO_10646__
(That macro, if it exists, is a long int number containing information about the version of Unicode supported by your compiler.)
The important thing is: if the macro __STDC_ISO_10646__ is not defined, you cannot be certain about the codes assigned to the characters in your system.
Thus, the "magic numbers" 10 and 13 cannot be relied upon.

Related

What's the difference between putch() and putchar()?

Okay so, I'm pretty new to C.
I've been trying to figure out what exactly is the difference between putch() and putchar()?
I tried googling my answers but all I got was the same copy-pasted-like message that said:
putchar(): This function is used to print one character on the screen, and this may be any character from C character set (i.e it may be printable or non printable characters).
putch(): The putch() function is used to display all alphanumeric characters through the standard output device like monitor. this function display single character at a time.
As English isn't my first language I'm kinda lost. Are there non printable characters in C? If so, what are they? And why can't putch produce the same results?
I've tried googling the C character set and all of the alphanumeric characters there are, but as much as my testing went, there wasn't really anything that one function could print and the other couldn't.
Anyways, I'm kind of lost.
Could anyone help me out? thanks!
TLDR;
what can putchar() do that putch() can't? (or the opposite or something idk)
dunno, hoped there would be a visible difference between the two but can't seem to find it.
putchar() is a standard function, possibly implemented as a macro, defined in <stdio.h> to output a single byte to the standard output stream (stdout).
putch() is a non-standard function available on some legacy systems, originally implemented on MS-DOS as a way to output characters to the screen directly, bypassing the standard streams' buffering scheme. This function is mostly obsolete; don't use it.
Output via stdout to a terminal is line buffered by default on most systems, so as long as your output ends with a newline, it will appear on the screen without further delay. If you need output to be flushed in the absence of a newline, use fflush(stdout) to force the stream buffered contents to be written to the terminal.
putch replaces the newline character (\n).
putchar is a function in the C programming language that writes a single character to the standard output.

Can backspace escape cancel a new-line escape?

I'm working with ubuntu.
Code:
printf("Hello\n\b world");
I get on terminal:
Hello
world
Why does backspace not cancel the \n?
Is there a hierarchy in chars?
How can I delete special chars?
Your question goes beyond the scope of the C language: printf("Hello\n\b world"); outputs the bytes from the format string, possibly translated according to the text mode handling of newlines:
on unix systems, the bytes are output to the system handle unmodified.
on Microsoft legacy systems, the newline is converted to CR LF and the other bytes transmitted unmodified.
If the standard output is directed to a file, the file will contain the translation of the newline and a backspace (0x08 on most systems).
If the standard output goes to a terminal, the handling of the backspace special character is outside the program's control: the terminal (hardware, virtual, local or remote...) will perform its task as programmed and configured... Most terminals move the cursor left one position on whatever display they control, some erase the character at that position. If the cursor is already at column 1, it is again system dependent whether backspace moves the cursor back to the end of the previous line, whatever that means. Many systems don't do that and keep the cursor at column 1. This seems consistent with the behavior you observe.
This is what the C standard says (in C 2018 5.2.2 2) about the new line character:
Moves the active position to the initial position of the next line.
and backspace:
Moves the active position to the previous position on the current line. If the active position is at the initial position of a line, the behavior of the display device is unspecified.
Note that the backspace character is not specified to erase a previous character. It is specified to cause a certain action on a display device.
Recall that C was developed in an era when teletypes and other physical printing devices were in common use. Many of these devices could only push the paper upward. Once a new line character caused the paper to be pushed upward, there was no way to move it downward again.
Additionally, some early video displays, or the software driving them, emulated physical printing and did not support going back a line, at least in some of their modes of operation.
On displays where one could move the cursor freely, it is not clear what a backspace from the beginning of a line should do. Consider a display which has 80 columns, numbered from 1 to 80, and the last line printed contained 40 characters, followed by a new line. When we backspace, we move the cursor back to that line, but which column do we move it to? Column 80, the last one of the display? Or column 40, the last one where something was printed? Different devices might handle this differently. Note that the latter choice requires the device to remember the length of each line, an added burden on early computing machinery. (My high school’s cheap display terminals did not have enough memory to remember all the text in a 24×80 display. I think it was only 1024 bytes, enough for 12.8 lines of 80 characters. If you wrote complete lines of text, it would scroll earlier lines off the display, keeping only the last 12.)
Because of these variations in behavior, the C standard did not specify the details of backspacing from the start of a line.
You ask about a “backspace escape” canceling a “new-line escape.” However, the escape sequences are irrelevant here; they are in a different layer of representation than the operations of the characters:
Inside a string literal, \b and \n are escape sequences. As the compiler translates the program, it replaces these with a backspace character and a new line character. Then they are no longer escape sequences; they are simply characters in a string.
When you write the characters with printf, they are transmitted as characters in a stream.
When the characters are sent to a display device (because that is what the stream is connected to), they produce the actions in the 5.2.2 2 text cited above.
Those escape sequences \b and \n represent control characters. A control character is a special character that, well, controls the behavior of the output device in some special way. When you say
printf("A");
it prints the (ordinary) character A to the screen. But when you say
printf("\n");
it doesn't print anything, instead it moves the cursor down to the beginning of the next line.
Now, the meaning of \b is not "cancel the character to the left". The control character \b does not "cancel" anything. What it does is just move the cursor one character to the left, if it can. But if the cursor is already at the left edge, it probably can't.
Once upon a time, and especially when the output was going to a printer that actually printed on paper, it was common to do things like
printf("this is u\b_n\b_d\b_e\b_r\b_l\b_i\b_n\b_e\b_d\b_\n");
or
printf("this is b\bbo\bol\bld\bd\n");
to print underlined or bold words by overprinting. These examples obviously rely on the move-one-to-the-left behavior of \b. These examples prove that the behavior of \b is not anything like "canceling"!
It sounds like you think \b might somehow affect the string it's part of.
It sounds like you think \b might somehow be processed by your C compiler, or by the C library.
It sounds like you think that the string "abc\bdef" might get converted to "abdef".
But none of these things is true. The backspace character \b is interpreted by your screen or your printer, or whatever output device your program is "printing" to. The interpretation of control characters like \b is mostly up to your output device. It is mostly not a property of the C programming language.

Why can't we print ASCII values from 0 to 31?

#include<stdio.h>
int main()
{
for(int i=0;i<=31;i++)
printf("%c",i);
}
When we run this code, nothing is printed.
What is the reason for this?
C is printing them, but perhaps your terminal is not displaying them. This distinction is important because the terminal is responsible for interpreting the output of your program, printing letters, moving the cursor around, changing colors and such.
By historical convention the first 32 characters of the ASCII table are considered "control characters", some of which are printable, some like backspace which move the cursor, others like BEL which can make your terminal beep.
Different terminals may display these differently, or not at all.
It's worth noting that ASCII pre-dates modern "glass" terminals and that these codes were used to move the print-head around on the page. Early machines used teletypes to communicate with them: a line-feed would crank down the paper one line, and a carriage return would move the print-head back to the start of the line, much like the physical carriage return on a typewriter, which would move the "carriage" back to the first column.
These were pretty elaborate electromechanical contraptions that didn't have any modern circuitry in them, yet they could still process ASCII data, at least for those using ASCII, as there are other character sets like EBCDIC that co-existed with ASCII.
These characters were never intended to be printed, so they don't have a standard visual representation in ASCII.
With "extended ASCII", as used in DOS, there are symbols defined for them because it seemed like a waste otherwise. These don't have control-code meanings, typically you write them directly to the console character buffer in order to see them.
You can, it's just that most of them are non-printable control characters that most terminals ignore. If you pipe stdout to a file, the file will contain those characters; it's just the terminal that doesn't know what to do with them. Some of them are handled by terminals (e.g. the line feed and backspace characters) but others are just nonsensical (e.g. end of transmission, data link escape) and get ignored, or replaced with a different character for display (often a space or a question mark or the like).

why when we write \n in the file it converts into \r\n combination?

I read in a book that when we attempt to write \n to a file using fputs(), fputs() converts the \n to a \r\n combination, and that if we then read the same line back using fgets(), the reverse conversion happens: \r\n is converted back to \n.
I don't get that what is the purpose behind this?
It is because Windows (and MS-DOS) text files are supposed to have lines ending in \r\n, and portable C programs are supposed to simply use \n because C was originally defined on Unix.
And it's not just fputs and fgets that do it - any I/O function on a text file, even getc and fread, will do the same conversion.
Succinctly, DOS is the reason for this.
Different systems have different conventions for line endings. Unix reckons one character, '\n', is sufficient to mark the end of a line. DOS decided that it needed two characters, '\r' and '\n', though other systems also used that convention. The versions of Mac OS 1-9 (prior to Mac OS X) used just '\r' instead. Other systems could use a count and the line data instead of a line ending, or could simulate punched cards with blanks up to a fixed length (72 or 80). Unix also doesn't distinguish between binary and text files; DOS does. (DOS also uses Control-Z to mark EOF in a text file. Unix doesn't have an EOF marker; it knows exactly how big the file is and uses that length to determine when it has reached EOF.)
C originated on Unix, but to make it easier to migrate code between the systems, the standard I/O package defined that when it was working on text files, the input side would convert a native line ending to the single '\n' character for uniform input, and the output side would convert a '\n' to the native line ending.
However, the mention of text files also meant that there needed to be binary files, where these mappings do not occur.
You might note that most of the internet protocols (HTTP, for example) mandate CRLF (carriage return, line feed, or '\r', '\n') for the end of line markers.
(Actually, blaming DOS, as in MS-DOS or PC-DOS, is a little unfair. There were other systems that used the CRLF line end convention before DOS existed, and they may have been more influential on the Internet. However, almost all those ancestral systems are substantially defunct, and Windows is the environment that you'll run into these days where the distinction between binary and text files matters, and where you'll encounter CRLF line endings.)
Note that the C standard has this to say about text files:
ISO/IEC 9899:2011 §7.21.2 Files
¶2 A text stream is an ordered sequence of characters composed into lines, each line
consisting of zero or more characters plus a terminating new-line character. Whether the
last line requires a terminating new-line character is implementation-defined. Characters
may have to be added, altered, or deleted on input and output to conform to differing
conventions for representing text in the host environment. Thus, there need not be a one-to-one correspondence between the characters in a stream and those in the external
representation. Data read in from a text stream will necessarily compare equal to the data
that were earlier written out to that stream only if: the data consist only of printing
characters and the control characters horizontal tab and new-line; no new-line character is
immediately preceded by space characters; and the last character is a new-line character.
Whether space characters that are written out immediately before a new-line character
appear when read in is implementation-defined.
That's a lot of things that might or might not happen. Note, in particular, that trailing blanks written to a file might, or might not, appear in the input — according to the standard. That allows the systems that support punched card images or fixed length records to comply with the standard.
Note, too (as pointed out by Giacomo Degli Eposti), that this all means that if you open a file in binary mode that was originally written as a text file, you may very well get a significantly different list of bytes back from the I/O system. You'll see two characters per newline; you might see a Control-Z followed by other characters (possibly null bytes) up to a 'block' boundary that might be a multiple of 256 bytes, etc.

Is \n multi-character in C?

I read that \n consists of CR & LF. Each has their own ASCII codes.
So is the \n in C represented by a single character or is it multi-character?
Edit: Kindly specify your answer, rather than simply saying "yes, it is" or "no, it isn't"
In a C program, it's a single character, '\n', representing end of line. However, some operating systems (most notably Microsoft Windows) use two characters to represent end of line in text files, and this is likely where the confusion comes from.
It's the responsibility of the C I/O functions to do the conversions between the C representation of '\n' and whatever the OS uses.
In C programs, simply use '\n'. It is guaranteed to be correct. When looking at text files with some sort of editor, you might see two characters. When a text file is transferred from Windows to some Unix-based system, you might get "^M" showing up at the end of each line, which is annoying, but has nothing to do with C.
Generally: '\n' is a single character, which represents a newline. '\r' is a single character, which represents a carriage-return. They are their own independent ASCII characters.
Issues arise because in the actual file representation, UNIX-based systems tend to use '\n' alone to represent what you think of when you hit "enter" or "return" on the keyboard, whereas Windows uses a '\r' followed directly by a '\n'.
In a file:
"This is my UNIX file\nwhich spans two lines"
"This is my Windows file\r\nwhich spans two lines"
Of course, like all binary data, these characters are all about interpretation, and that interpretation depends on the application using the data. Stick to '\n' when you are making C-strings, unless you want a literal carriage-return, because as people have pointed out in the comments, the OS representation doesn't concern you. IO libraries, including C's, are supposed to handle this themselves and abstract it away from you.
For your curiosity, in decimal, '\n' in ASCII is 10, '\r' is 13, but note that this is the ASCII standard, not a C standard.
It depends:
'\n' is a single character (ASCII LF)
"\n" is a '\n' character followed by a 0 terminator
some I/O operations transform a '\n' into '\r\n' on some systems (CR-LF).
When you print the \n to a file, using the windows C stdio libraries, the library interprets that as a logical new-line, not the literal character 0x0A. The output to the file will be the windows version of a new-line: 0x0D0A (\r\n).
Writing
Sample code:
#include <stdio.h>
int main(void) {
    FILE *f = fopen("foo.txt", "w"); /* "w" opens the file in text mode */
    if (f == NULL) return 1;
    fprintf(f, "foo\nbar");
    fclose(f);
    return 0;
}
A quick cl /EHsc foo.c later and you get
0x666F6F 0x0D0A 0x626172 (separated for convenience)
in foo.txt under a hex editor.
It's important to note that this translation DOES NOT occur if you are writing to a file in 'binary mode'.
Reading
If you are reading the file back in using the same tools, also on windows, the "windows EOL" will be interpreted properly if you try to match up against \n.
When reading it back
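#include <stdio.h>
int main(void) {
    FILE *f = fopen("foo.txt", "r"); /* "r" opens the file in text mode */
    char c;
    if (f == NULL) return 1;
    while (fscanf(f, "%c", &c) == 1) /* read one character at a time */
        printf("%x-", (unsigned)(unsigned char)c);
    fclose(f);
    return 0;
}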
You get
66-6f-6f-a-62-61-72-
Therefore, the only time this should be relevant to you is if you are
Moving files back and forth between mac/unix and windows. Unix needs no real explanation here, since \n directly translates to 0x0A on those platforms. (pre-OSX \n was 0x0D on mac iirc)
Putting text in binary files, only do this carefully please
Trying to figure out why your binary data is being messed up when you opened the file "w", instead of "wb"
Estimating something important based on the size of the file, on windows you'll have an extra byte per newline.
\n is a new-line -- it's a logical representation of whatever separates one line from another in a text file.
A given platform will have some physical representation of that logical separation between lines. On Unix and most similar systems, the new-line is represented by a line-feed (LF) character (and since Unix was/is so closely associated with C, on Unix the LF is often just called a new-line). On MacOS, it's typically represented by a carriage-return (CR). On a fair number of other systems, most prominently Windows, it's represented by a carriage return/line feed pair -- normally in that order, though once in a while you see something use LF followed by CR (as I recall, Clarion used to do that).
In theory, a new-line doesn't need to correspond to any characters in the stream at all though. For example, a system could have text files that were stored as a length followed by the appropriate number of characters. In such a case, the run-time library would need to carry out a slightly more extensive translation between internal and external representations of text files than is now common, but such is life.
According to the C99 Standard (section 5.2.2),
\n "moves the active position [where the next character from fputc would appear] to the initial position on the next line".
Also
[\n] shall produce a unique implementation-defined value
which can be stored in a single char object. The external representations in a text file
need not be identical to the internal representations and are outside the scope of [the C99 Standard]
Most C implementations choose to define \n as ASCII line feed (0x0A) for historical reasons. However, on many computer operating systems, the sequence for moving the active position to the beginning of the next line requires two characters usually 0x0D, 0x0A. So, when writing to a text file, the C implementation must convert the internal sequence of 0x0A to the external one of 0x0D, 0x0A. How this is done is outside of the scope of the C standard, but usually, the file IO library will perform the conversion on any file opened in text mode.
Your question is about text files.
A text file is a sequence of lines.
A line is a sequence of characters ending in (and including) a line break.
A line break is represented differently by different Operating Systems.
On Unix/Linux/Mac they are usually represented by a single LINEFEED
On Windows they are usually represented by the pair CARRIAGE RETURN + LINEFEED
On old Macs they were usually represented by a single CARRIAGE RETURN
On other systems (AS/400 ??) there may even not be a specific character that represents a line break ...
Anyway, the library code in C is responsible for translating the system's line break to '\n' when reading text files and doing the reverse operation when writing text files.
So, no matter what the representation is on any given system, when you read a text file in C, lines will be ended by a '\n'.
Note: The '\n' is not necessarily 0x0a in all systems.
Yes it is.
\n is a newline. Hex code is 0x0A.
\r is a carriage return. Hex code is 0x0D
It is a single character. It represents Newline (but is not the only representation - Wikipedia).
EDIT: The question was changed while I was typing the answer.
