What is the role of '\n' in fprintf in C?

I am confused about the role of '\n' in fprintf. I understand that it copies characters into an array and that the '\n' character signals when to stop reading characters from the current buffer. But then, when does it know to make the write system call?
For example, fprintf(stdout, "hello") prints, but I never gave it a '\n' character, so how does it know to make the system call?

The system call is made when the stream is synced/flushed. This can happen when the buffer is full (try writing a LOT without a \n and you'll see output at some point), when a \n is seen (IF you have line buffering configured for the stream, which is the default for tty devices), or when a call to fflush() is made.
When the stream is closed with fclose(), it is flushed as well. When the process terminates normally, exit() flushes and closes every open stream before handing control back to the operating system. Each of these events leads to the write system call that actually emits the output.
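To see this in action, here is a small sketch of my own (not part of the answer above) that prints without a newline, sleeps, and then flushes explicitly; it assumes a POSIX system for sleep(). When stdout is a line-buffered terminal, you typically see nothing until the fflush() call, or, without it, until the program exits.
#include <stdio.h>
#include <unistd.h>   /* for sleep(); assumes a POSIX system */
int main(void)
{
    fprintf(stdout, "hello");   /* no '\n': the text stays in the stdio buffer */
    sleep(3);                   /* nothing visible yet on a line-buffered tty */
    fflush(stdout);             /* force the write() system call now */
    sleep(3);                   /* "hello" is already on the screen */
    return 0;                   /* returning from main flushes any remaining output */
}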

First of all, fprintf("hello"); isn't correct; it should be, for instance, fprintf(stdout, "hello");. Next, \n doesn't mean "stop reading or writing characters", because \n is itself a character: the line feed (ASCII code 10).

The assumptions you've stated about the purpose and use of \n are wrong. \n is simply one of the 128 ASCII characters. It is not a command or directive that causes anything to happen on its own; it is just a passive character that, when written by the various C output functions (e.g. sprintf(), printf(), fprintf(), etc.), makes the output appear the way you want.
By the way, \n (newline) is in good company with many other escape sequences, such as \t (tab) and \r (carriage return).
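A quick sketch of my own (not part of the original answer) showing that \n really is just an ordinary character with code 10:
#include <stdio.h>
#include <string.h>
int main(void)
{
    printf("'\\n' has character code %d\n", '\n');         /* prints 10 */
    printf("strlen(\"a\\nb\") is %zu\n", strlen("a\nb"));  /* prints 3: 'a', '\n', 'b' */
    return 0;
}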

Related

Guarantee that getchar receives newline or EOF (eventually)?

I would like to read characters from stdin until one of the following occurs:
an end-of-line marker is encountered (the normal case, in my thinking),
the EOF condition occurs, or
an error occurs.
How can I guarantee that one of the above events will happen eventually? In other words, how do I guarantee that getchar will eventually return either \n or EOF, provided that no error (in terms of ferror(stdin)) occurs?
// (How) can we guarantee that the LABEL'ed statement will be reached?
int c;
int done = 0;
while (!0) {
    if ((c = getchar()) == EOF || ferror(stdin) || c == '\n')
        break;
}
LABEL: done = !0;
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur. It seems like the answer will have to do with the properties of the device. Where can those details be found (in the documentation for the compiler, device firmware, or device hardware, perhaps)?
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition. Similarly for files stored on disc / SSD.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I am using C89, but I am curious if the answer depends on which C standard is used.
You can't.
Let's say I run your program, then I put a weight on my keyboard's "X" key and go on vacation to Hawaii. On the way there, I get struck by lightning and die.
There will never be any input other than 'x'.
Or, I may decide to type the complete story of Moby Dick, without pressing enter. It will probably take a few days. How long should your program wait before it decides that maybe I won't ever finish typing?
What do you want it to do?
Looking at all the discussion in the comments, it seems you are looking in the wrong place:
It is not a matter of keyboard drivers or wrapping stdin.
It is also not a matter of what programming language you are using.
It is a matter of the purpose of the input in your software.
Basically, it is up to you as a programmer to know how much input you want or need, and then decide when to stop reading input, even if valid input is still available.
Note that not only are there devices that can send input forever without triggering an EOF or end-of-line condition, but there are also programs that will happily read input forever.
This is by design.
Common examples can be found in the command-line tools of POSIX-style OSes (like Linux).
Here is a simple example:
cat /dev/urandom | hexdump
This will print random numbers for as long as your computer is running, or until you hit Ctrl+C.
Though cat will stop when there is nothing more to read (EOF or any read error), it does not expect such an end, so unless there is a bug in the implementation you are using, it will happily run forever.
So the real question is:
When does your program need to stop reading characters and why?
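As a concrete illustration of "the program decides when to stop" (my own sketch, not from the answer above): read at most a fixed number of characters, and also stop at a newline or EOF, whichever comes first. The MAX_CHARS limit is an arbitrary value chosen for the example.
#include <stdio.h>
#define MAX_CHARS 16   /* arbitrary limit chosen by the program, not the device */
int main(void)
{
    int c;
    int n = 0;
    /* stop after MAX_CHARS characters, at a newline, or at EOF */
    while (n < MAX_CHARS && (c = getchar()) != EOF && c != '\n') {
        putchar(c);
        n++;
    }
    putchar('\n');
    return 0;
}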
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur.
A device such as /dev/zero, for example. Yes, stdin can be connected to a device that never provides a newline or reaches EOF, and that is not expected ever to report an error condition.
It seems like the answer will have to do with the properties of the device.
Indeed so.
Where can those details be found (in the doumentation for compiler, device firmware, or device hardware perhaps)?
Generally, it's a question of the device driver. And in some cases (such as the /dev/zero example) that's all there is anyway. Generally drivers do things that are sensible for the underlying hardware, but in principle they don't have to.
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition.
No. Generally speaking, an end-of-line marker is sent by a terminal device if and only if the Enter key is pressed. An end-of-file condition might be signaled if the terminal disconnects (but the program continues), or if the user explicitly causes one to be sent (by typing Ctrl-D on Linux or Mac, for example, or Ctrl-Z on Windows). Neither of those events need actually happen on any given run of a program, and it is very common for the latter not to.
Similarly for files stored on disc / SSD.
You can generally rely on data read from an ordinary file to contain newlines where they are present in the file itself. If the file is open in text mode, then the system-specific text line terminator will also be translated to a newline, if it differs. It is not necessary for a file to contain any of those, so a program reading from a regular file might never see a newline.
You can rely on EOF being signaled when a read is attempted while the file position is at or past the end of the file's data.
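A small sketch of my own (with a hypothetical file name, input.txt) that reads a regular file to EOF and reports whether it contained any newline at all:
#include <stdio.h>
int main(void)
{
    FILE *fp;
    int c;
    int saw_newline = 0;
    fp = fopen("input.txt", "r");   /* hypothetical file name for the example */
    if (fp == NULL) {
        perror("fopen");
        return 1;
    }
    while ((c = getc(fp)) != EOF) {
        if (c == '\n')
            saw_newline = 1;
    }
    /* EOF is guaranteed eventually for a regular file; a newline is not */
    printf("newline seen: %s\n", saw_newline ? "yes" : "no");
    fclose(fp);
    return 0;
}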
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I think you're trying too hard.
Reading to end-of-line might be a reasonable thing to do in some cases. Expecting a newline to eventually be reached is reasonable if the program is intended to support interactive use. But trying to ensure that invalid data cannot be fed to your program is a losing cause. Your objective should be to accept the widest array of inputs you reasonably can, and to fail gracefully when other inputs are presented.
If you need to read input in a line-by-line mode then by all means do that, and document that you do it. If only the first n characters of each line are significant to the program then document that, too. Then, if your program never terminates when a user connects its input to /dev/zero that's on them, not on you.
On the other hand, try to avoid placing arbitrary constraints, especially on sizes of things. If there is not a natural limit on the size of something, then no artificial limit you introduce will ever be enough.
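For the stated use case (keep only the first few characters of a line and discard the rest), here is a line-oriented sketch along the lines the answer suggests; the SIGNIFICANT prefix length is an arbitrary value I picked for illustration.
#include <stdio.h>
#include <string.h>
#define SIGNIFICANT 8   /* arbitrary: only the first 8 characters of the line matter */
int main(void)
{
    char buf[SIGNIFICANT + 2];   /* room for SIGNIFICANT chars plus '\n' and '\0' */
    int c;
    if (fgets(buf, sizeof buf, stdin) == NULL)
        return 1;                /* EOF or error before any input */
    /* If the line was longer than the buffer, discard the remainder of it. */
    if (strchr(buf, '\n') == NULL)
        while ((c = getchar()) != EOF && c != '\n')
            ;
    printf("significant part: %s\n", buf);
    return 0;
}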

C programming getch(), getchar(), getche() [duplicate]

What is the exact difference between the getch and getchar functions?
getchar() is a standard function that gets a character from stdin.
getch() is non-standard. It gets a character from the keyboard (which may be different from stdin) and does not echo it.
The Standard C function is getchar(), declared in <stdio.h>. It has existed basically since the dawn of time. It reads one character from standard input (stdin), which is typically the user's keyboard, unless it has been redirected (for example via the shell input redirection character <, or a pipe).
getch() and getche() are old MS-DOS functions, declared in <conio.h>, and still popular on Windows systems. They are not Standard C functions; they do not exist on all systems. getch reads one keystroke from the keyboard immediately, without waiting for the user to hit the Return key, and without echoing the keystroke. getche is the same, except that it does echo. As far as I know, getch and getche always read from the keyboard; they are not affected by input redirection.
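A minimal sketch of the conio.h style, assuming a DOS/Windows compiler that provides <conio.h>; this is non-standard, and the exact spelling (getch() vs _getch()) varies by toolchain.
#include <stdio.h>
#include <conio.h>   /* non-standard; DOS/Windows compilers only */
int main(void)
{
    int c;
    printf("Press any key (it will not be echoed): ");
    c = getch();                 /* returns immediately, no Enter needed, no echo */
    printf("\nYou pressed character code %d\n", c);
    printf("Press another key (this one is echoed): ");
    c = getche();                /* returns immediately, echoes the key */
    printf("\nYou pressed character code %d\n", c);
    return 0;
}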
The question naturally arises, if getchar is the standard function, how do you use it to read one character without waiting for the Return key, or without echoing? And the answers to those questions are at least a little bit complicated. (In fact, they're complicated enough that I suspect they explain the enduring popularity of getch and getche, which if nothing else are very easy to use.)
And the answer is that getchar has no control over details like echoing and input buffering -- as far as C is concerned, those are lower-level, system-dependent issues.
But it is useful to understand the basic input model which getchar assumes. Confusingly, there are typically two different levels of buffering:
1. As the user types keys on the keyboard, they are read by the operating system's terminal driver. Typically, in its default mode, the terminal driver echoes keystrokes immediately as they are typed (so the user can see what they are typing). Typically, in its default mode, the terminal driver also supports some amount of line editing -- for example, the user can hit the Delete or Backspace key to delete an accidentally-typed character. In order to support line editing, the terminal driver is typically collecting characters in an input buffer. Only when the user hits Return are the contents of that buffer made available to the calling program. (This level of buffering is present only if standard input is in fact a keyboard or other serial device. If standard input has been redirected to a file or pipe, the terminal driver is not in effect and this level of buffering does not apply.)
2. The stdio package reads characters from the operating system into its own input buffer. getchar simply fetches the next character from that buffer. When the buffer is empty, the stdio package attempts to refill it by reading more characters from the operating system.
So, if we trace what happens starting when a program calls getchar for the first time: stdio discovers that its input buffer is empty, so it tries to read some characters from the operating system, but there aren't any characters available yet, so the read call blocks. Meanwhile, the user may be typing some characters, which are accumulating in the terminal driver's input buffer, but the user hasn't hit Return yet. Finally, the user hits Return, and the blocked read call returns, returning a whole line's worth of characters to stdio, which uses them to fill its input buffer, out of which it then returns the first one to that initial call to getchar, which has been patiently waiting all this time. (And then if the program calls getchar a second or third time, there probably are some more characters -- the next characters on the line the user typed -- available in stdio's input buffer for getchar to return immediately. For a bit more on this, see section 6.2 of these C course notes.)
But in all of this, as you can see, getchar and the stdio package have no control over details like echoing or input line editing, because those are handled earlier, at a lower level, in the terminal driver, in step 1.
So, at least under Unix-like operating systems, if you want to read a character without waiting for the Return key, or control whether characters are echoed or not, you do that by adjusting the behavior of the terminal driver. The details vary, but there's a way to turn echo on and off, and a way (actually a couple of ways) to turn input line editing on and off. (For at least some of those details, see this SO question, or question 19.1 in the old C FAQ list.)
When input line editing is turned off, the operating system can return characters immediately (without waiting for the Return key), because in that case it doesn't have to worry that the user might have typed a wrong keystroke that needs to be "taken back" with the Delete or Backspace key. (But by the same token, when a program turns off input line editing in the terminal driver, if it wants to let the user correct mistakes, it must implement its own editing, because it is going to see --- that is, successive calls to getchar are going to return -- both the user's wrong character(s) and the character code for the Delete or Backspace key.)
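Here is a minimal sketch of the Unix approach the answer describes, using POSIX termios to turn off canonical mode and echo around a single getchar() call. It assumes a Unix-like system and that stdin is a terminal.
#include <stdio.h>
#include <termios.h>   /* POSIX terminal control */
#include <unistd.h>
int main(void)
{
    struct termios old_tio, new_tio;
    int c;
    tcgetattr(STDIN_FILENO, &old_tio);           /* save current terminal settings */
    new_tio = old_tio;
    new_tio.c_lflag &= ~(ICANON | ECHO);         /* turn off line editing and echo */
    tcsetattr(STDIN_FILENO, TCSANOW, &new_tio);
    c = getchar();                               /* now returns after a single keystroke */
    tcsetattr(STDIN_FILENO, TCSANOW, &old_tio);  /* restore the original settings */
    printf("You typed character code %d\n", c);
    return 0;
}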
getch() just gets an input character but never displays it on the screen, regardless of whether we press the Enter key.
getchar() gets an input character and displays it on the screen when we press the Enter key.
getchar is standard C, found in stdio.h. It reads one character from stdin (the standard input stream, which is console input on most systems). It is a blocking call, since it requires the user to type a character and then press Enter. It echoes user input to the screen.
getc(stdin) is 100% equivalent to getchar, except it can also be used for other input streams.
getch is non-standard, typically found in the old, obsolete MS-DOS header conio.h. It works just like getchar except that it doesn't block after the first keystroke; it allows the program to continue without the user pressing Enter. It does not echo input to the screen.
getche is the same as getch, also non-standard, but it echoes input to the screen.

Different behaviour of Ctrl-D (Unix) and Ctrl-Z (Windows)

As per title I am trying to understand the exact behavior of Ctrl+D / Ctrl+Z in a while loop with a gets (which I am required to use). The code I am testing is the following:
#include <stdio.h>
#include <stdlib.h>

int main()
{
    char str[80];
    while (printf("Insert string: ") && gets(str) != NULL) {
        puts(str);
    }
    return 0;
}
If my input is simply a Ctrl+D (or Ctrl+Z on Windows) gets returns NULL and the program exits correctly. The unclear situation is when I insert something like house^D^D (Unix) or house^Z^Z\n (Windows).
In the first case, my interpretation is that a getchar (or something similar inside the gets function) waits for read() to get the input; the first Ctrl+D flushes the buffer, which is not empty (hence not EOF), and then the second time read() is called, EOF is triggered.
In the second case, though, I noticed that the first Ctrl+Z is inserted into the buffer while everything that follows is simply ignored. Hence my understanding is that the first read() call inserted house^Z and discarded everything else, returning 5 (the number of characters read). (I say 5 because otherwise I think a simple Ctrl+Z should return 1 without triggering EOF.) Then the program waits for more input from the user, hence a second read() call.
I'd like to know what I get right and wrong of the way it works and which part of it is simply implementation dependent, if any.
Furthermore, I noticed that on both Unix and Windows, even after EOF is triggered it seems to reset to false in the following gets() call, and I don't understand why this happens or in which line of the code.
I would really appreciate any kind of help.
(12/20/2016) I heavily edited my question in order to avoid confusion
The CTRL-D and CTRL-Z "end of file" indicators serve a similar purpose on Unix and Windows systems respectively, but are implemented quite differently.
On Unix systems (including Unix clones like Linux) CTRL-D, while officially described as the end-of-file character, is actually a delimiter character. It does almost the same thing as the end-of-line character (usually carriage return or CTRL-M) which is used to delimit lines. Both characters tell the operating system that the input line is finished and to make it available to the program. The only difference is that with the end-of-line character a line feed (CTRL-J) character is inserted at the end of the input buffer to mark the end of the line, while with the end-of-file character nothing is inserted.
This means that when you enter house^D^D on Unix, the read system call will first return a buffer of length 5 with the 5 characters house in it. When read is called again to obtain more input, it will then return a buffer of length 0 with no characters in it. Since a zero-length read on a normal file indicates that the end of file has been reached, the gets library function also interprets this as the end of file and stops reading the input. However, since it filled the buffer with 5 characters, it doesn't return NULL to indicate that it reached end of file. And since it hasn't actually reached end of file, as terminal devices aren't actually files, further calls to gets after this will make further calls to read, which will return any subsequent characters that the user types.
On Windows CTRL-Z is handled much differently. The biggest difference is that it's not treated specially by the operating system at all. When you type house^Z^Z^M on Windows, only the carriage return character is given special treatment. Just like on Unix, the carriage return makes the typed line available to the program, though in this case a carriage return and a line feed are added to the buffer to mark the end of the line. So the result is that the ReadFile function returns a 9-byte-long buffer with the 9 characters house^Z^Z^M^J in it.
It is actually the program itself, specifically the C runtime library, that treats CTRL-Z specially. In the case of the Microsoft C runtime library, when it sees the CTRL-Z character in the buffer returned by ReadFile, it treats it as an end-of-file marker and ignores everything after it. Using the example in the previous paragraph, gets ends up calling ReadFile to get more input, because the fact that it has seen the CTRL-Z character isn't remembered when reading from the console (or other device) and it hasn't yet seen the end of line (which was ignored). If you then press Enter again, gets will return with the buffer filled with the 7 bytes house^Z\0 (adding a 0 byte to indicate the end of the string). By default, it does much the same thing when reading from normal files: if a CTRL-Z character appears in a file, it and everything after it is ignored. This is for backward compatibility with CP/M, which only supported files with lengths that were multiples of 128 and used CTRL-Z to mark where text files were really supposed to end.
Note that both the Unix and Windows behaviours described above are only the normal default handling of user input. The Unix handling of CTRL-D only occurs when reading from a terminal device in canonical mode, and it's possible to change the "end-of-file" character to something else. On Windows the operating system never treats CTRL-Z specially, but whether the C runtime library does or not depends on whether the FILE stream being read is in text or binary mode. This is why, in portable programs, you should always include the character b in the mode string when opening binary files (e.g. fopen("foo.gif", "rb")).
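To reproduce the experiment yourself, here is a small sketch of mine (it uses fgets rather than gets, since gets was removed from the language in C11) that prints what each read returned and whether feof(stdin) is set afterwards:
#include <stdio.h>
int main(void)
{
    char str[80];
    for (;;) {
        printf("Insert string: ");
        if (fgets(str, sizeof str, stdin) == NULL) {
            printf("\nfgets returned NULL, feof(stdin) = %d\n", feof(stdin));
            break;
        }
        /* note: str may still contain the trailing '\n' when one was read */
        printf("got: \"%s\", feof(stdin) = %d\n", str, feof(stdin));
    }
    return 0;
}
Typing house followed by Ctrl-D twice on Unix should show a first successful read of "house" with feof(stdin) still 0, and then a NULL return with feof(stdin) set (nonzero).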

Is this the proper way to flush the C input stream?

Well, I have been doing a lot of searching on Google and on here about how to flush the input stream properly. All I hear is mixed arguments about how fflush() is undefined for the input stream; some say just do it that way, and others say don't do it. I haven't had much luck finding a clear, efficient, proper way of doing it that the majority of people agree on. I am quite new at programming, so I don't know all the syntax/tricks of the language yet. So my question is: which way is the most efficient/proper solution for clearing the C input stream?
(A) Use getchar() twice before I try to receive more input?
(B) Just use the fflush() function on the input? or
(C) This is how I thought I should do it:
void clearInputBuf(void);

void clearInputBuf(void)
{
    int garbageCollector;
    while ((garbageCollector = getchar()) != '\n' && garbageCollector != EOF)
    {
    }
}
So, whenever I need to read new input with scanf(), or use getchar() to pause the program, I just call clearInputBuf(). So what would be the best way out of the three solutions, or is there an even better option?
All I hear is mixed arguments about how fflush() is undefined for the input stream
That's correct. Don't use fflush() to flush an input stream.
(A) will work for simple cases where you leave a single character in the input stream (such as scanf() leaving a newline).
(B) Don't use it. It's defined on some platforms, but don't rely on what the standard calls undefined behaviour.
(C) is clearly the best of the 3 options, as it can "flush" any number of characters in the input stream.
But if you read lines (such as using fgets()), you'll probably have much less need to clear input streams.
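A sketch of that line-based style (my own illustration): read a whole line with fgets() and parse it with sscanf(), so there is never stray input left behind to clear.
#include <stdio.h>
int main(void)
{
    char line[128];
    int value;
    printf("Enter a number: ");
    /* read the whole line first, then parse it; nothing is left in the stream */
    if (fgets(line, sizeof line, stdin) != NULL && sscanf(line, "%d", &value) == 1)
        printf("You entered %d\n", value);
    else
        printf("That was not a number.\n");
    return 0;
}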
It depends on what you think of as "flushing an input stream".
For output streams, the flush operation makes sure that all data that were written to the stream but were being kept buffered in memory have been flushed to the underlying filesystem file. That's a very well defined operation.
For input streams, there is no well defined operation of what flushing the stream should do. I would say that it does not make any sense.
Some implementations of the C standard library redefine the meaning of "flush" for input streams to mean "clear the rest of the current line which has not been read yet". But that's entirely arbitrary, and other implementations choose to do nothing instead.
As of C11 this disparity has been corrected, and the standard now explicitly states that the fflush() function does not work with input streams, precisely because it makes no sense, and we do not want each runtime library vendor to go implementing it in whatever way they feel like.
So, please by all means do go ahead and implement your clearInputBuf() function the way you did, but do not think of it as "flushing the input stream". There is no such thing.
It turns out to be platform dependent.
fflush() cannot take an input stream as a parameter because, according to the C standard, that is undefined behaviour: the behaviour simply is not defined anywhere.
On Windows, there is a defined behavior for fflush() and it does what you need it to do.
On Linux, there is fpurge(3) which does what you want it to do too.
The best way is to simply read all characters in a loop until:
a newline character is found, or
EOF is returned from getchar(),
just like your clearInputBuf() function does.
Note that flushing an output stream means writing all unwritten data to the stream: the data that is still in a buffer waiting to be flushed. But reading all the unread bytes from a stream does not have the same meaning.
That's why it doesn't make sense to fflush() an input stream. On the other hand, fpurge() is designed specifically for this, and its name is a better choice, because you want to clear the input stream and start fresh. The problem is, it's not a standard function.
Reading fpurge(3) should clarify why fflush(stdin) is undefined behaviour, and why an implementation like the one on Windows doesn't make sense, because it makes fflush() behave differently with different inputs. That's like making C compliant with PHP.
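If you do want the platform-specific "discard pending input" behaviour the answer mentions, a hedged sketch might look like this. It assumes glibc's __fpurge() from <stdio_ext.h> on Linux and fpurge() from <stdio.h> on the BSDs/macOS; neither is standard C, and on other systems the macro below simply does nothing.
#include <stdio.h>
#if defined(__linux__)
#include <stdio_ext.h>                  /* glibc: __fpurge() */
#define PURGE_STDIN() __fpurge(stdin)
#elif defined(__APPLE__) || defined(__FreeBSD__)
#define PURGE_STDIN() fpurge(stdin)     /* BSD: fpurge() is declared in <stdio.h> */
#else
#define PURGE_STDIN() ((void)0)         /* no portable equivalent */
#endif
int main(void)
{
    int value;
    printf("Enter a number: ");
    if (scanf("%d", &value) == 1)
        printf("Read %d\n", value);
    PURGE_STDIN();                      /* throw away whatever is left in the stdio buffer */
    return 0;
}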
The problem is more subtle than it looks:
On systems with elaborate terminal devices, such as Unix and OS X, input from the terminal is buffered at 2 separate levels: the system terminal uses a buffer to handle line editing, from just correcting input with backspace to full line editing with cursor and control keys. This is called cooked mode. A full line of input is buffered in the system until the Enter key is typed or the end-of-file key combination is entered.
The FILE functions perform their own buffering, which is line buffered by default for streams associated with a terminal. The buffer size is set to BUFSIZ by default, and bytes are requested from the system when the buffered contents have been consumed. For most requests, a full line will be read from the system into the stream buffer, but in some cases, such as when the buffer is full, only part of the line will have been read from the system when scanf() returns. This is why discarding the contents of the stream buffer might not always suffice.
Flushing the input buffer may mean different things:
discarding extra input, including the newline character, that has been entered by the user in response to input requests such as getchar(), fgets() or scanf(). The need for flushing this input is especially obvious in the case of scanf(), because most format strings will not cause the newline to be consumed.
discarding any pending input and waiting for the user to hit a key.
You can implement a flush function portably for the first case:
int flush_stream(FILE *fp) {
    int c;
    while ((c = getc(fp)) != EOF && c != '\n')
        continue;
    return c;
}
And this is exactly what your clearInputBuf() function does for stdin.
For the second case, there is no portable solution, and system-specific methods are non-trivial.
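For completeness, on POSIX systems one such system-specific method is tcflush() from <termios.h>, which asks the terminal driver itself to discard pending, not-yet-read input. A sketch under that assumption; it only applies when stdin is actually a terminal.
#include <stdio.h>
#include <termios.h>   /* POSIX: tcflush() */
#include <unistd.h>    /* STDIN_FILENO, isatty() */
int main(void)
{
    printf("Type some extra characters, then press Enter...\n");
    getchar();                               /* consume one character */
    if (isatty(STDIN_FILENO))
        tcflush(STDIN_FILENO, TCIFLUSH);     /* discard unread input held by the terminal driver */
    return 0;
}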

Resources