Understanding the hardware of printf - c

I was wondering whether there are any resources available online that explain what happens at the very low level (BIOS/kernel calls) when something like C's printf is called.

Linux:
printf() ---> printf() in the C library ---> write() in C library ---> write() system call in kernel.
To understand the interface between user space and kernel space, you will need to have some knowledge of how system calls work.
To understand what is going on at the lowest levels, you will need to analyze the source code in the kernel.
The Linux system call quick reference (pdf link) may be useful as it identifies where in the kernel source you might begin looking.
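As a rough illustration of the chain above, the sketch below does by hand what the C library does for you: it formats the text in user space with snprintf() and then passes the bytes to the write() system call. The name format_and_write is made up for this example, and a POSIX system is assumed. A real printf() also buffers its output instead of calling write() once per call.

```c
#include <stdio.h>      /* snprintf */
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* write, STDOUT_FILENO */

/* Hypothetical helper: format "name=value\n" in user space, then make
 * a single write() system call on the given file descriptor.
 * Returns the number of bytes written, or -1 on error. */
ssize_t format_and_write(int fd, const char *name, int value)
{
    char buf[128];
    int len = snprintf(buf, sizeof buf, "%s=%d\n", name, value);

    if (len < 0 || (size_t)len >= sizeof buf)
        return -1;                        /* formatting failed or truncated */

    return write(fd, buf, (size_t)len);   /* the kernel takes over here */
}
```

Calling format_and_write(STDOUT_FILENO, "answer", 42) sends the text to standard output. Running such a program under a system call tracer like strace is one way to see the write() call being made.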

Something like printf, or printf specifically? That is somewhat vague.
printf outputs to the stdout FILE* stream; what that is associated with is system dependent and can moreover be redirected to any other stream device for which the OS provides a suitable device driver. I work in embedded systems, and most often stdout is by default directed to a UART for serial I/O - often that is the only stream I/O device supported, and cannot be redirected. In a GUI OS for console mode applications, the output is 'drawn' graphically in the system defined terminal font to a window, in Windows for example this may involve GDI or DirectDraw calls, which in turn access the video hardware's device driver. On a modern desktop OS, console character output does not involve the BIOS at all other than perhaps initial bootstrapping.
So in short, there typically is a huge amount of software between a printf() call and the hardware upon which it is output.

This is very platform-specific. From a hardware perspective, the back-end implementation of printf() could be directed to a serial port, a non-serial LCD, etc. You're really asking two questions:
How does printf() interpret arguments and format string to generate correct output?
How does output get from printf() to your target device?
You must remember that an OS, kernel, and BIOS are not required for an application to function. Embedded apps typically have printf() and other I/O routines write to a character ring buffer. An interrupt handler (or a polling loop) may then drain that buffer and drive the output hardware (LCD, serial port, laser show, etc.) to send the buffered output to the correct destination.
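A minimal sketch of such a ring buffer might look like the following. All names here (ring_put, ring_drain_one, fake_uart_send) are invented for illustration, and fake_uart_send stands in for the write to a real hardware data register so that the code can be tried on a PC. It assumes a single producer and a single consumer; real firmware would also need to think about interrupt masking and memory ordering.

```c
#include <stddef.h>

#define RING_SIZE 64u

static volatile unsigned char ring[RING_SIZE];
static volatile size_t head;   /* next slot to write (producer side) */
static volatile size_t tail;   /* next slot to read (consumer side)  */

/* Producer: what a bare-metal putchar() back end might call.
 * Returns 0 on success, -1 if the buffer is full. */
int ring_put(unsigned char c)
{
    size_t next = (head + 1) % RING_SIZE;
    if (next == tail)
        return -1;             /* full: one slot is always left empty */
    ring[head] = c;
    head = next;
    return 0;
}

/* Consumer: would be called from a (hypothetical) transmit-ready
 * interrupt handler. Returns 1 if a byte was sent, 0 if empty. */
int ring_drain_one(void (*uart_send)(unsigned char))
{
    unsigned char c;
    if (tail == head)
        return 0;              /* nothing buffered */
    c = ring[tail];
    tail = (tail + 1) % RING_SIZE;
    uart_send(c);
    return 1;
}

/* Stand-in for the hardware: just records what would be transmitted. */
static char fake_uart[128];
static size_t fake_len;

void fake_uart_send(unsigned char c)
{
    if (fake_len < sizeof fake_uart)
        fake_uart[fake_len++] = (char)c;
}
```

The point of the split is that the code calling printf() never waits for the hardware: it only fills the buffer, and the device is serviced whenever it is ready.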

By definition, BIOS and kernel calls are platform-specific. What platform are you interested in? Several links to Linux-related information have already been posted.
Also note that printf may not even result in any BIOS or kernel calls, as your platform may not have a kernel or BIOS present (embedded systems are a good example of this).

printf() is a variadic function: the caller supplies a format string followed by a variable number of arguments.
Conceptually, printf() builds its output string in an internal buffer.
printf() walks through the format string one character at a time and copies each ordinary character to the output. It only pauses when it reaches a "%".
A "%" means there is an argument to convert (arguments can be a char, int, long, double, string, and so on; a float is promoted to double when passed). printf() converts the argument to text and appends it to the output buffer. If the argument is a string, it simply copies the string.
Finally, when printf() reaches the end of the format string, it writes the buffered output to the stdout stream.
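The steps above can be sketched as a tiny formatter. This is only an illustration under simplifying assumptions: mini_format is a made-up name, it supports just %d, %s, %c and %%, and it ignores widths, precision, and every other feature of the real printf().

```c
#include <stdarg.h>
#include <stddef.h>

/* Append one character if there is room, keeping space for '\0'. */
static void put(char *out, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap)
        out[(*len)++] = c;
}

/* Hypothetical mini formatter: writes a NUL-terminated result into
 * out (cap bytes) and returns the length of that result. */
size_t mini_format(char *out, size_t cap, const char *fmt, ...)
{
    size_t len = 0;
    va_list ap;

    if (cap == 0)
        return 0;
    va_start(ap, fmt);
    for (const char *p = fmt; *p; p++) {
        if (*p != '%') {                 /* ordinary character: copy it */
            put(out, cap, &len, *p);
            continue;
        }
        p++;                             /* look at the conversion letter */
        if (*p == '\0')
            break;                       /* lone '%' at end of format */
        switch (*p) {
        case 'd': {                      /* int -> decimal digits */
            int v = va_arg(ap, int);
            unsigned int u = v < 0 ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[12];
            int n = 0;
            do { digits[n++] = (char)('0' + u % 10); u /= 10; } while (u);
            if (v < 0)
                put(out, cap, &len, '-');
            while (n > 0)
                put(out, cap, &len, digits[--n]);
            break;
        }
        case 's': {                      /* string: plain copy */
            const char *s = va_arg(ap, const char *);
            while (*s)
                put(out, cap, &len, *s++);
            break;
        }
        case 'c':                        /* char arrives promoted to int */
            put(out, cap, &len, (char)va_arg(ap, int));
            break;
        case '%':
            put(out, cap, &len, '%');
            break;
        default:                         /* unknown: emit it unchanged */
            put(out, cap, &len, '%');
            put(out, cap, &len, *p);
            break;
        }
    }
    va_end(ap);
    out[len] = '\0';
    return len;
}
```

To complete the picture, the caller would pass the finished buffer to something like fputs(buf, stdout), which corresponds to the final step of copying the buffer to stdout.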

Related

How does the standard library conform to the text stream model?

My question is regarding the following paragraph on page 15 (Section 1.5) of The ANSI C Programming Language (2e) by Kernighan and Ritchie (emphasis added):
The model of input and output supported by the standard library is very simple.
Text input or output, regardless of where it originates or where it goes to,
is dealt as a stream of characters. A text stream is a sequence of characters divided
into lines; each line consists of zero or more characters followed by a newline character.
It is the responsibility of the library to make each input or output stream conform to
this model; the C programmer using the library need not worry about how lines are
represented outside the program.
I'm unsure of what is meant by the text in bold, especially the line "it is the responsibility of the library to make each input or output stream conform to this model." Could someone please help me understand what this means?
At first, I thought it had something to do with the line-buffering of stdin I was seeing when I call getchar() when stdin is empty, but then learned that the buffering mode varies across implementations (see here). So I don't think this is what the text in bold is referring to when it talks about conforming to the text stream model.
Consider running code like printf("hello world"); in the firmware of a USB device. Suppose that whatever characters you pass to printf are sent over USB from the device to the computer. The way the USB protocol works, the characters must be split up into groups of characters called packets. There is a maximum packet size depending on how your USB hardware and descriptors are configured. Also, for efficiency, you want to fill up the packets whenever possible, because sending a packet that is less than the maximum size means the computer will stop letting you send more data for a while. Also, if the computer doesn't receive your packet, you might need to re-send it. Also, if your USB packet buffers are already filled, you might need to wait a while until one of them gets sent.
To make programming in C a manageable task, the implementation of printf needs to handle all of these details so the user doesn't need to worry about them when they are calling printf. For example, it would be really bad if printf was only able to send a single packet of 1 to 8 bytes whenever you call it, and thus it returns an error whenever you give it more than 8 characters.
This is called an abstraction: the underlying system has some complexity (like USB endpoints, packets, buffers, retries). You don't want to think about that stuff all the time so you make a library that transforms that stuff into a more abstract interface (like a stream of characters). Or you just use a "standard library" written by someone else that takes care of that for you.
If you want a more PC-centric example... I believe that printf is implemented on many systems by calling the write system call. Since write isn't always guaranteed to actually write all of the data you give it, the implementation of printf needs to try multiple times to write the data you give it. Also, for efficiency, the printf implementation might buffer the data you give it in RAM for a while before passing it to the kernel with write. You don't generally have to worry about retrying or buffering details while programming in C because once your program terminates or you flush the buffer, the standard library makes sure all your data has been written.
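The retry logic mentioned above might be sketched like this on a POSIX system. The name write_all is an assumption made for this example, not a standard function, and real library implementations differ in the details.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* write */

/* Keep calling write() until all len bytes have been accepted or a
 * real error occurs. Returns 0 on success, -1 on error (errno is
 * left as set by write). */
int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: try again */
            return -1;
        }
        p += n;                  /* skip the part the kernel accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller that uses write() directly has to carry a loop like this around. A caller that uses printf() does not, which is the abstraction the paragraph above describes.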

Can you get input from devices aside from the keyboard in C's standard library?

I was reading a book from 1997 that teaches how to program in C, and it always uses the word “usually” when specifying that functions like scanf take input from the keyboard. Because of this, I'm curious as to if functions like scanf can take input from other devices, or if it used to.
scanf takes input from the program's standard input. What this is connected to is a matter of the operating environment and the way the program is launched. (Look up "I/O redirection"). It is not unusual for a program's standard input to be connected to a file on disk or to the output of another program. It sometimes is connected to a socket. More rarely, it is connected to a serial port, or to a null device, or a source of zeroes or random bytes.
Historically, it might have been connected to a card or paper tape reader.
In principle, it can be connected to any device that produces data -- a mouse, for example -- but just because something is possible doesn't make it useful.
After freopen( ..., ...., stdin), scanf() input can come from many possible sources.
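As a small sketch of the freopen() approach, the function below rebinds stdin to a file and then reads a number with plain scanf(). The function name and the idea of reading a single integer are assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical helper: attach stdin to the file at path, then read one
 * integer with ordinary scanf(). Returns 0 on success, -1 on failure.
 * After the call, stdin stays attached to that file. */
int read_int_from_file_via_stdin(const char *path, int *out)
{
    if (freopen(path, "r", stdin) == NULL)
        return -1;                       /* could not reopen stdin */
    return scanf("%d", out) == 1 ? 0 : -1;
}
```

The same effect can usually be had with no code change at all by using shell redirection, for example running the program as ./prog < numbers.txt.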

What is stdin in C language?

I want to build my own scanf function. The basic idea is to read data from one memory address and save it to another memory address.
What is stdin? Is it a memory address like 000ffaa?
If it is a memory address, which one is it, so that I can build my own scanf function? Thanks!
No, stdin is not "a memory address".
It's an I/O stream, basically an operating-system level abstraction that allows data to be read (or written, in the case of stdout).
You need to use the proper stream-oriented I/O functions to read from the stream.
Of course you can read from RAM too, so it's best to have your own function take a callback that reads one character; you can then supply a callback that reads either from RAM or from stdin.
Something like:
int my_scanf(int (*getchar_callback)(void *state), void *state, const char *fmt, ...);
is usually reasonable. The state pointer is some user-defined state that is required by the getchar_callback() function, and passed to it by my_scanf().
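Here is a much-reduced sketch of that design. Instead of a full my_scanf() with a format string, it shows a hypothetical my_scan_int() that reads a single decimal integer through the callback, together with two possible callbacks: one that reads from a string in RAM and one that reads from a FILE * such as stdin. The names are invented for the example, and overflow is not handled.

```c
#include <ctype.h>
#include <stdio.h>

typedef int (*getchar_fn)(void *state);

/* Callback 1: state points to a `const char *` cursor into memory. */
int mem_getchar(void *state)
{
    const char **cursor = state;
    if (**cursor == '\0')
        return EOF;
    return (unsigned char)*(*cursor)++;
}

/* Callback 2: state is a FILE *, for example stdin. */
int file_getchar(void *state)
{
    return fgetc((FILE *)state);
}

/* Skips leading whitespace, then reads an optionally signed decimal
 * integer. Returns 1 if a number was converted, 0 otherwise.
 * Unlike the real scanf(), the character that ends the number is
 * consumed rather than pushed back. */
int my_scan_int(getchar_fn next, void *state, int *out)
{
    int c, sign = 1, value = 0, seen = 0;

    do {
        c = next(state);
    } while (c != EOF && isspace(c));

    if (c == '-' || c == '+') {
        if (c == '-')
            sign = -1;
        c = next(state);
    }
    while (c != EOF && isdigit(c)) {
        value = value * 10 + (c - '0');
        seen = 1;
        c = next(state);
    }
    if (!seen)
        return 0;
    *out = sign * value;
    return 1;
}
```

With this split, my_scan_int(file_getchar, stdin, &n) reads from standard input, while passing mem_getchar and the address of a const char * cursor reads from a buffer. The parsing code is the same in both cases.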
stdin is an "input stream", which is an abstract term for something that takes input from the user or from a file. It is an abstraction layer sitting on top of the actual file handling and I/O. The purpose of streams is mainly to make your code portable between different systems.
Reading/writing to memory is much more low-level and has nothing to do with streams as such. In order to use a stream in a meaningful way, you would have to know how a certain compiler implements the stream internally, which may not be public information. In some cases, like in Windows, streams are defined by the OS itself and can get accessed through API calls.
If you are looking to build your own scanf function, you would have to look into specific API functions for a specific OS, then build your own abstraction layer on top of those.
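On Unix, everything is a file:
https://en.wikipedia.org/wiki/Everything_is_a_file
Or, as is noted there, more precisely:
Everything is a file descriptor
On a Unix system you can find /dev/stdin, which is typically a symbolic link to /dev/fd/0 (on Linux, to /proc/self/fd/0). When input comes from a terminal, that descriptor refers to a character special file.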
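One way to check what a descriptor actually refers to is fstat(). The sketch below assumes a POSIX system, and fd_kind is a name made up for the example.

```c
#include <sys/stat.h>   /* fstat, S_ISCHR, S_ISFIFO, S_ISREG */
#include <unistd.h>     /* STDIN_FILENO */

/* Hypothetical helper: name the kind of file behind a descriptor. */
const char *fd_kind(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return "invalid descriptor";
    if (S_ISCHR(st.st_mode))
        return "character device";
    if (S_ISFIFO(st.st_mode))
        return "pipe or FIFO";
    if (S_ISREG(st.st_mode))
        return "regular file";
    return "other";
}
```

Calling fd_kind(STDIN_FILENO) would be expected to report a character device when the program is run from a terminal, a pipe when it is run as echo hi | ./prog, and a regular file when it is run as ./prog < file.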

C: How Efficient Are Output Routines in Terms of Buffering?

I can't find any information on whether buffering is already implicitly done out of the box when one is writing a file with either fprintf or fwrite. I understand that this might be implementation/platform dependent feature. What I'm interested in, is whether I can at least expect it to be implemented efficiently on modern popular platforms such as Windows, Linux, or Mac OS X?
AFAIK, usually buffering for I/O routines is done on 2 levels:
Library level: this could be C standard library, or Java SDK (BufferedOutputStream), etc.;
OS level: modern platforms extensively cache/buffer I/O operations.
My question is about #1, not #2 (as I know it's already true). In other words, can I expect C standard library implementations for all modern platforms to take advantage of buffering?
If not, then is manually creating a buffer (with cleverly chosen size) and flushing it on overflow a good solution to the problem?
Conclusion
Thanks to everyone who pointed out functions like setbuf and setvbuf. Their existence is exactly the evidence I was looking for to answer my question. Useful extract:
All files are opened with a default allocated buffer (fully buffered)
if they are known to not refer to an interactive device. This function
can be used to either set a specific memory block to be used as buffer
or to disable buffering for the stream.
The default streams stdin and stdout are fully buffered by default if
they are known to not refer to an interactive device. Otherwise, they
may either be line buffered or unbuffered by default, depending on the
system and library implementation. The same is true for stderr, which
is always either line buffered or unbuffered by default.
In most cases buffering for stdio routines is tuned to be consistent with typical block size of the operating system in question. This is done to optimize the number of I/O operations in the default case. Of course you can always change it with setbuf()/setvbuf() routines.
Unless you are doing something special, you should stick to the default buffering as you can be quite sure it's mostly optimal on your OS (for the typical scenario).
The only case that justifies changing it is when you want to use the stdio library to interact with I/O channels that are not geared towards it, in which case you might want to disable buffering altogether. But I don't see such cases too often.
You can safely assume that standard I/O is sensibly buffered on any modern system.
As @David said, you can expect sensible buffering (at both levels).
However, there can be a huge difference between fprintf and fwrite, because fprintf interprets a format string.
If you stack-sample it, you can find a significant percent of time converting doubles into character strings, and stuff like that.
The C I/O library lets you control how buffering is done (inside the application, before whatever the OS does) with setvbuf. If you don't specify anything, the standard requires that, as initially opened, stdin and stdout are fully buffered if and only if they can be determined not to refer to an interactive device, while stderr is never fully buffered, even if it could be detected that it is directed to a non-interactive device.
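A short sketch of setvbuf in use is shown below. The helper names open_fully_buffered and file_size are invented for the example, and the buffer passed in must stay valid until the stream is closed.

```c
#include <stdio.h>

/* Hypothetical helper: open path for writing and ask for full
 * buffering in a caller-supplied buffer. Returns NULL on failure. */
FILE *open_fully_buffered(const char *path, char *buf, size_t size)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return NULL;
    /* setvbuf must come before any other operation on the stream. */
    if (setvbuf(f, buf, _IOFBF, size) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}

/* Size of the file as seen through a second, independent stream,
 * or -1 on error. */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    long n;

    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    n = ftell(f);
    fclose(f);
    return n;
}
```

On typical implementations, a few bytes written with fputs() to such a stream stay in the library buffer, so file_size() still reports 0 until fflush() or fclose() is called. Passing _IONBF instead of _IOFBF disables library-level buffering for the stream.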

How does scanf() work inside the OS?

I've been wondering how scanf()/printf() actually works in the hardware and OS levels. Where does the data flow and what exactly is the OS doing around these times? What calls does the OS make? And so on...
scanf() and printf() are functions in libc (the C standard library), and they call the read() and write() operating system syscalls respectively, talking to the file descriptors stdin and stdout respectively (fscanf and fprintf allow you to specify the file stream you want to read/write from).
Calls to read() and write() (and all syscalls) result in a 'context switch' out of your user-level application into kernel mode, which means it can perform privileged operations, such as talking directly to hardware. Depending on how you started the application, the 'stdin' and 'stdout' file descriptors are probably bound to a console device (such as tty0), or some sort of virtual console device (like that exposed by an xterm). read() and write() safely copy the data to/from a kernel buffer called a 'uio'.
The format-string conversion part of scanf and printf does not occur in kernel mode, but in ordinary user mode (inside 'libc'). The general rule of thumb with syscalls is that you switch to kernel mode as infrequently as possible, both to avoid the performance overhead of context switching and for security (you need to be very careful about anything that happens in kernel mode; less code in kernel mode means fewer bugs and security holes in the operating system).
btw.. all of this was written from a unix perspective, I don't know how MS Windows works.
On the OS I am working with, scanf and printf are based on the functions getch() and putch().
I think the OS just provides two streams, one for input and the other for output; the streams abstract away how the output data gets presented or where the input data comes from.
So what scanf and printf do is just consume bytes from, or add bytes to, those streams.
scanf, printf and similar functions cannot be written entirely in portable C/C++. The formatting and buffering logic is normally ordinary C, but the final step that talks to the operating system or the hardware typically needs a small amount of assembly language, for example via the "asm" keyword. Code written with "asm" is passed through to the object file essentially unchanged by the compiler. So, in short, scanf, printf, etc. rely on a little assembly at the lowest level, and you can design your own input function the same way using the "asm" keyword.

Resources