What's really happening inside getchar() and printf()? Explain whole process - c

I was learning from GeeksforGeeks, and when I saw this, a thought came to my mind: as getchar() and other similar functions return an int (whether due to failure of program or EOF), then why format specifier used is %c, why not %d(or %i).
// Example for getchar() in C
#include <stdio.h>
int main()
{
printf("%c", getchar());
return 0;
}
I know that what we type in is a character, that it is taken by the function getchar(), and that we are displaying it using %c.
But my problem is what's actually happening inside getchar() and printf() during the whole process, where getchar() is returning that integer, where it is getting stored and how the character we inputted is getting displayed by printf() i.e. what's happening inside printf()?
I did some research on the printf() implementation and learned that printf is part of the C standard library (a.k.a. the libc) and is a variadic function, but I could not find out what is really happening inside this function and how it knows from the format specifier whether it has to print a character or an int.
Please help me learn the whole detailed process that is going on.

The man page for getchar says the following:
fgetc() reads the next character from stream and returns it as an
unsigned char cast to an int, or EOF on end of file or error.
getc() is equivalent to fgetc() except that it may be implemented as a
macro which evaluates stream more than once.
getchar() is equivalent to getc(stdin).
So let's say you enter the character A. Assuming your system uses ASCII representation of characters, this character has an ASCII value of 65. So in this case getchar returns 65 as an int.
This int value of 65 returned by getchar is then passed to printf as the second argument. The printf function first looks at the format string and sees the %c format specifier. The man page for printf says the following regarding %c:
If no l modifier is present, the int argument is converted to an
unsigned char, and the resulting character is written.
So printf reads the next argument as an int. Since we passed in an int with value 65, that's what it reads. That value is then converted to an unsigned char. Since it is still in the range of that type, the value is still 65. printf then prints the character for that value. And since 65 is the ASCII value for the character A, the character A is what appears.
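The conversion described above can be observed without a terminal by formatting into a buffer. This is a minimal sketch, assuming an ASCII system with 8-bit bytes; render_as_char and render_as_int are hypothetical helper names, not standard functions.

```c
#include <stdio.h>

/* Formats value the way printf("%c", value) would: the int argument is
   converted to unsigned char and that single character is written. */
int render_as_char(int value, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%c", value);
}

/* Formats the same int as decimal digits, as printf("%d", value) would. */
int render_as_int(int value, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%d", value);
}
```

Under those assumptions, render_as_char(65, ...) stores "A" and render_as_int(65, ...) stores "65". Passing 65 + 256 to render_as_char also stores "A", because the conversion to unsigned char keeps only the low 8 bits.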

To understand this, read about default argument promotions.
where getchar() is returning that integer, where it is getting stored
C handles this for you.
how the character we inputted is getting displayed by printf()
%c tells printf() to print a character.

(I am supposing your computer has an x86-64 processor and runs Linux)
why format specifier used is %c, why not %d(or %i).
Imagine that the corresponding argument (to printf) was 99 (an int). If you use %c then the letter c (of ASCII code 99) is displayed. If you use %d or %i then 99 is displayed by printf.
printf is, as you noticed, a variadic function. It is implemented using variadic primitives like va_start and va_end which are macros expanded to some builtin known to the compiler. How exactly arguments are passed and results are given (the calling convention) is defined (in some processor & OS specific way) in a document called ABI (application binary interface).
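To illustrate how a variadic function can decide from the format string how to read each argument, here is a toy sketch. mini_format is a hypothetical function written for this explanation (it understands only %c, %d and %%); it is not how glibc or musl implement printf.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical toy formatter: understands only %c, %d and %%.
   Writes at most size-1 bytes plus a terminating '\0' into out.
   Returns the number of bytes stored (excluding the '\0'). */
size_t mini_format(char *out, size_t size, const char *fmt, ...)
{
    va_list args;
    size_t used = 0;

    if (size == 0)
        return 0;

    va_start(args, fmt);
    for (const char *p = fmt; *p != '\0' && used + 1 < size; ++p) {
        if (*p != '%') {
            out[used++] = *p;          /* ordinary character: copy it */
            continue;
        }
        ++p;                           /* look at the conversion character */
        if (*p == 'c') {
            /* char arguments are promoted to int in a variadic call,
               so the argument is fetched as int and then narrowed. */
            out[used++] = (char)(unsigned char)va_arg(args, int);
        } else if (*p == 'd') {
            int n = snprintf(out + used, size - used, "%d", va_arg(args, int));
            if (n < 0)
                break;
            used += ((size_t)n < size - used) ? (size_t)n : size - used - 1;
        } else if (*p == '%') {
            out[used++] = '%';
        } else {
            break;                     /* unsupported conversion or end of string */
        }
    }
    va_end(args);
    out[used] = '\0';
    return used;
}
```

The function has no way to inspect the types of the arguments it receives. It trusts the format string, which is why a mismatch between a conversion and its argument is undefined behavior in the real printf.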
On some C standard library implementations, printf (and related functions, like vfprintf) would ultimately use putc or something related.
Notice that standard I/O functions (those in <stdio.h>) are likely to be provided with the help of some operating system. Read Operating Systems : Three Easy Pieces for more about OSes.
Quite often, the C standard library will use some system calls to interact with the operating system kernel. For Linux these are listed in syscalls(2), but read Advanced Linux Programming. To output some data the write(2) syscall would be used (but the C standard library is generally buffering, see setvbuf(3)).
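As a rough illustration of the layer beneath the stdio buffer, the sketch below hands bytes to the kernel with write(2). It assumes a POSIX system; write_all is a hypothetical helper name, and real C libraries do this differently in detail.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling write(2) until all len bytes are
   handed to the kernel. Returns 0 on success, -1 on error (errno set). */
int write_all(int fd, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        data += n;               /* write may accept fewer bytes than asked */
        len -= (size_t)n;
    }
    return 0;
}
```

Calling write_all(STDOUT_FILENO, "A\n", 2) bypasses the stdio buffer entirely, which is the opposite of what printf normally does.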
BTW, for Linux/x86-64 both GNU glibc & musl-libc are free software implementations of the C standard library, and you can study their source code (most of it is coded in C, with a tiny bit of assembly for the system call glue).
But my problem is what's actually happening inside getchar() and printf() during the whole process, where getchar() is returning that integer, where it is getting stored ...?
The ABI specifies that the result of an int-returning function is passed back in register %rax, and getchar (like every other function returning int) works that way. See the X86-64 Linux ABI referenced here.
... and how the character we inputted is getting displayed by printf() i.e. what's happening inside printf()?
After many software layers, when the stdout stream gets flushed (e.g. by some call to fflush(3), by a \n newline character, or at exit(3) time, including returning from main into crt0 code), the C standard library will use the write(2) syscall. The kernel will process it to show something (but the details are horribly complex; first read The TTY demystified). Actually, millions of source code lines are involved (inside the kernel, read about DRM; inside the display server, such as X.Org or Wayland; some code inside the GPU; and inside the terminal emulator). Linux is free software, so in principle you can study all of it (but that needs more than a lifetime; a typical Linux distribution has about twenty billion lines of source code). See also the OSDev wiki, which gives some practical information, including about native Intel graphics (which are the most primitive graphics today).
PS. You need to spend more than ten years understanding all the details (and I don't).

Related

What's the difference between putch() and putchar()?

Okay so, I'm pretty new to C.
I've been trying to figure out what exactly is the difference between putch() and putchar()?
I tried googling my answers but all I got was the same copy-pasted-like message that said:
putchar(): This function is used to print one character on the screen, and this may be any character from C character set (i.e it may be printable or non printable characters).
putch(): The putch() function is used to display all alphanumeric characters through the standard output device like monitor. this function display single character at a time.
As English isn't my first language I'm kinda lost. Are there non printable characters in C? If so, what are they? And why can't putch produce the same results?
I've tried googling the C character set and all of the alphanumeric characters there are, but as much as my testing went, there wasn't really anything that one function could print and the other couldn't.
Anyways, I'm kind of lost.
Could anyone help me out? thanks!
TLDR;
what can putchar() do that putch() can't? (or the opposite or something idk)
dunno, hoped there would be a visible difference between the two but can't seem to find it.
putchar() is a standard function, possibly implemented as a macro, defined in <stdio.h> to output a single byte to the standard output stream (stdout).
putch() is a non-standard function available on some legacy systems, originally implemented on MS-DOS as a way to output characters to the screen directly, bypassing the standard stream buffering scheme. This function is mostly obsolete; don't use it.
Output via stdout to a terminal is line buffered by default on most systems, so as long as your output ends with a newline, it will appear on the screen without further delay. If you need output to be flushed in the absence of a newline, use fflush(stdout) to force the stream buffered contents to be written to the terminal.
putch replaces the newline character (\n).
putchar is a function in the C programming language that writes a single character to the standard output.

Will printf still have a cost even if I redirect output to /dev/null?

We have a daemon that contains a lot of print messages. Since we are working on an embedded device with a weak CPU and other constrained hardware, we want to minimize any kind of cost (IO, CPU, etc.) of printf messages in our final version. (Users don't have a console.)
My teammate and I have a disagreement. He thinks we can just redirect everything to /dev/null. It won't cost any IO, so the effect will be minimal. But I think it will still cost CPU, and we had better define a macro for printf so we can rewrite "printf" (maybe to just return).
So I need some opinions about who is right. Will Linux be smart enough to optimize printf? I really doubt it.
Pretty much.
When you redirect the stdout of the program to /dev/null, any call to printf(3) will still evaluate all the arguments, and the string formatting process will still take place before calling write(2), which writes the full formatted string to the standard output of the process. It's at the kernel level that the data isn't written to disk, but discarded by the handler associated with the special device /dev/null.
So at the very best, redirecting stdout to /dev/null does not bypass the overhead of evaluating the arguments and passing them to printf, the string formatting job behind printf, or at least one system call to actually write the data. What you do save on Linux is the device write itself: the /dev/null implementation just returns the number of bytes you wanted to write (specified by the 3rd argument of your call to write(2)) and ignores everything else (see this answer). Depending on the amount of data you're writing and the speed of the target device (disk or terminal), the difference in performance may vary a lot. On embedded systems, generally speaking, cutting off the disk write by redirecting to /dev/null can save quite some system resources for a non-trivial amount of written data.
Although in theory, the program could detect /dev/null and perform some optimizations within the restrictions of standards they comply to (ISO C and POSIX), based on general understanding of common implementations, they practically don't (i.e. I am unaware of any Unix or Linux system doing so).
The POSIX standard mandates writing to the standard output for any call to printf(3), so it's not standard-conforming to suppress the call to write(2) depending on the associated file descriptors. For more details about POSIX requirements, you can read Damon's answer. Oh, and a quick note: All Linux distros are practically POSIX-compliant, despite not being certified to be so.
Be aware that if you replace printf completely, some side effects may go wrong, for example printf("%d%n", a++, &b). If you really need to suppress the output depending on the program execution environment, consider setting a global flag and wrap up printf to check the flag before printing — it isn't going to slow down the program to an extent where the performance loss is visible, as a single condition check is much faster than calling printf and doing all the string formatting.
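The flag-checking wrapper suggested above could look roughly like this. It is a minimal sketch: g_log_enabled and log_printf are hypothetical names chosen for illustration.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical global switch: nonzero means "print", zero means "stay silent". */
static int g_log_enabled = 1;

/* Hypothetical wrapper: skips all formatting when logging is disabled.
   Returns vprintf's result when enabled, 0 when disabled. */
int log_printf(const char *format, ...)
{
    va_list args;
    int written;

    if (!g_log_enabled)
        return 0;                /* one cheap check, no format parsing */

    va_start(args, format);
    written = vprintf(format, args);
    va_end(args);
    return written;
}
```

Note that the caller still evaluates the arguments before the flag is checked, so a side effect such as a++ survives, but a %n conversion would not store anything while logging is disabled.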
The printf function will write to stdout. It is not conforming to optimize for /dev/null.
Therefore, you will have the overhead of parsing the format string and evaluating any necessary arguments, and you will have at least one syscall, plus you will copy a buffer to kernel address space (which, compared to the cost of the syscall, is negligible).
This answer is based on the specific documentation of POSIX.
System Interfaces
dprintf, fprintf, printf, snprintf, sprintf - print formatted output
The fprintf() function shall place output on the named output stream. The printf() function shall place output on the standard output stream stdout. The sprintf() function shall place output followed by the null byte, '\0', in consecutive bytes starting at *s; it is the user's responsibility to ensure that enough space is available.
Base Definitions
shall
For an implementation that conforms to POSIX.1-2017, describes a feature or behavior that is mandatory. An application can rely on the existence of the feature or behavior.
The printf function writes to stdout. If the file descriptor connected to stdout is redirected to /dev/null, the process still performs the write, but the kernel discards the data, so no output ends up anywhere. The call to printf itself and the formatting it does will still happen.
Write your own function that wraps printf(), using the printf() source as a guideline, and returns immediately if a noprint flag is set. The downside is that when actually printing it will consume more resources, because the format string has to be parsed twice. But it uses negligible resources when not printing. You can't simply replace printf(), because the underlying calls inside printf() can change with a newer version of the stdio library.
void printf2(const char *formatstring, ...);
Generally speaking, an implementation is permitted to perform such optimisations if they do not affect the observable (functional) outputs of the program. In the case of printf(), that would mean that if the program doesn't use the return value, and if there are no %n conversions, then the implementation would be allowed to do nothing.
In practice, I'm not aware of any implementation on Linux that currently (early 2019) performs such an optimisation - the compilers and libraries I'm familiar with will format the output and write the result to the null device, relying on the kernel to ignore it.
You may want to write a forwarding function of your own if you really need to save the cost of formatting when the output is not used - you'll want it to return void, and you should check the format string for %n. (You could use snprintf with a NULL and 0 buffer if you need those side-effects, but the savings are unlikely to repay the effort invested).
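The "NULL and 0 buffer" idea mentioned above can be sketched as follows. formatted_length is a hypothetical helper name; it relies on the C99 rule that the buffer pointer may be NULL when the size is 0.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: runs the formatter without storing any output.
   Returns the number of characters the full output would contain,
   or a negative value on error. */
int formatted_length(const char *format, ...)
{
    va_list args;
    int needed;

    va_start(args, format);
    needed = vsnprintf(NULL, 0, format, args);  /* nothing is written anywhere */
    va_end(args);
    return needed;
}
```

This keeps the return value available while avoiding the system call, but it still pays the full cost of parsing the format string.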
In C, writing 0; executes and does nothing, which is similar to ;.
This means you can write a macro like
#if DEBUG
#define devlognix(frmt,...) fprintf(stderr,(frmt).UTF8String,##__VA_ARGS__)
#else
#define nadanix 0
#define devlognix(frmt,...) nadanix
#endif
#define XYZKitLogError(frmt, ...) devlognix(frmt, ##__VA_ARGS__)
where XYZKitLogError would be your Log command.
or even
#define nadanix ;
which will kick out all log calls at compile time and replace them with 0; or ; so they get parsed out.
You will get unused-variable warnings, but it does what you want, and this side effect can even be helpful because it tells you about computation that is not needed in your release.
.UTF8String is an Objective-C method converting an NSString to const char*; you don't need it.
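A plain C variant of the same compile-time idea might look like the sketch below. DEV_LOG and count_evaluations are hypothetical names, and the example assumes a C99 compiler (for variadic macros).

```c
#include <stdio.h>

/* DEV_LOG is a hypothetical name. With DEBUG defined to a nonzero value it
   forwards to fprintf(stderr, ...); otherwise it expands to a no-op expression. */
#if defined(DEBUG) && DEBUG
#define DEV_LOG(...) fprintf(stderr, __VA_ARGS__)
#else
#define DEV_LOG(...) ((void)0)
#endif

/* Example caller: returns how many times its argument expression ran. */
int count_evaluations(void)
{
    int evaluations = 0;
    DEV_LOG("value: %d\n", ++evaluations);
    return evaluations;   /* 0 when DEV_LOG is compiled out, 1 in a DEBUG build */
}
```

Unlike a run-time flag, the compiled-out macro also removes the evaluation of the arguments, so any side effects inside a log call disappear in release builds.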

kprintf printing out block letters

In my C program in operating system code (on the kernel side), I am trying to use kprintf to print a character, but whenever I do, it prints the character as well as some block character which has four small circles in it.
kprintf(&ch);
Does anyone know what's going on here?
The printf() family of functions take a format string which tells what you want to print. You cannot print a character directly as you are doing, because printf() (or kprintf() as the case may be) will continue to read as if it were a string. You want something like:
kprintf("%c", ch);
The format string tells printf() what additional arguments to expect. In this case, %c indicates a character argument.

scanf Cppcheck warning

Cppcheck shows the following warning for scanf:
Message: scanf without field width limits can crash with huge input data. To fix this error message add a field width specifier:
%s => %20s
%i => %3i
Sample program that can crash:
#include <stdio.h>
int main()
{
int a;
scanf("%i", &a);
return 0;
}
To make it crash:
perl -e 'print "5"x2100000' | ./a.out
I cannot crash this program typing "huge input data". What exactly should I type to get this crash? I also don't understand the meaning of the last line in this warning:
perl -e ...
The last line is an example command to run to demonstrate the crash with the sample program. It essentially causes perl to print "5" 2,100,000 times and then pass this to the stdin of the program "a.out" (which is meant to be the compiled sample program).
First of all, scanf() should be used for testing only, not in real-world programs, because of several issues it won't handle gracefully (e.g. asking for "%i" when the user inputs "12345abc": the "abc" will stay in stdin and might cause following inputs to be filled without a chance for the user to change them).
Regarding this issue: scanf() will know it should read an integer value, however it won't know how long it can be. The pointer could point to a 16 bit integer, 32 bit integer, or a 64 bit integer or something even bigger (which it isn't aware of). Functions with a variable number of arguments (defined with ...) don't know the exact datatype of elements passed, so they have to rely on the format string (the reason the format tags are not optional like in C#, where you just number them, e.g. "{0} {1} {2}"). And without a given length it has to assume some length, which might be platform dependent as well (making the function even more unsafe to use).
In general, consider it possibly harmful and a starting point for buffer overflow attacks. If you'd like to secure and optimize your program, start by replacing it with alternatives.
I tried running the perl expression against the C program and it did crash here on Linux (segmentation fault).
Using scanf (or fscanf and sscanf) in real-world applications is usually not recommended at all, because it is not safe and is often a source of buffer overruns when incorrect input data is supplied.
There are much more secure ways to input numbers in many commonly used libraries for C++ (Qt, runtime libraries for Microsoft Visual C++, etc.). You can probably find secure alternatives for "pure" C too.
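One commonly used alternative in plain C is to read a whole line with fgets and convert it with strtol, checking every failure case. The sketch below shows the conversion step; parse_int is a hypothetical helper name, and it accepts base 10 only (unlike %i, which also accepts octal and hexadecimal prefixes).

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to int with full error checking.
   Returns 1 and stores the result in *out on success; returns 0 if the
   text has no digits, has trailing garbage, or does not fit in an int. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text)                          /* no digits at all */
        return 0;
    if (*end != '\0' && *end != '\n')         /* e.g. "12345abc" */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                             /* too large or too small */

    *out = (int)value;
    return 1;
}
```

A caller would typically fill a fixed-size buffer with fgets(line, sizeof line, stdin) and pass line to parse_int, so an oversized input is rejected instead of being parsed.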

will 'printf' always do its job?

printf("/*something else*/"); /*note that:without using \n in printf*/
I know printf() uses a buffer which prints whatever it contains when, in the line buffer, "\n" is seen by the buffer function. So when we forget to use "\n" in printf(), rarely, the line buffer will not be emptied. Therefore, printf() won't do its job. Am I wrong?
The example you gave above is safe as there are no variable arguments to printf. However it is possible to specify a format string and supply variables that do not match up with the format, which can deliver unexpected (and unsafe) results. Some compilers are taking a more proactive approach with printf use case analysis, but even then one should be very, very careful when printf is used.
From my man page:
These functions return the number of characters printed (not including
the trailing \0 used to end output to strings) or a negative value
if an output error occurs, except for snprintf() and vsnprintf(), which
return the number of characters that would have been printed if the n
were unlimited (again, not including the final \0).
So, it sounds like they can fail, returning a negative value.
Yes, output to stdout in C (using printf) is normally line buffered. This means that printf() will collect output until either:
the buffer is full, or
the output contains a \n newline
If you want to force the output of the buffer, call fflush(stdout). This will work even if you have printed something without a newline.
Also printf and friends can fail.
Common implementations of C call malloc() in the printf family of the standard C library.
malloc can fail, and then so will printf. On UNIX the write() call can be interrupted by a signal and fail with EINTR. Windows can and will do similar things.
And... Although you do not see it posted here often you should always check the return code from any system or library function that returns a value.
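Checking the return values and forcing the flush can be combined in a small helper. This is a minimal sketch; print_now is a hypothetical name.

```c
#include <stdio.h>

/* Hypothetical helper: prints text without a trailing newline and forces
   it out of the stdio buffer. Returns 0 on success, -1 if either step fails. */
int print_now(FILE *stream, const char *text)
{
    if (fprintf(stream, "%s", text) < 0)   /* negative return means output error */
        return -1;
    if (fflush(stream) == EOF)             /* push buffered bytes to the OS */
        return -1;
    return 0;
}
```

Passing the text through "%s" rather than using it as the format string also avoids the problem of stray conversion specifiers in the text.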
Like that, no. It won't always work as you expect, especially if you're using user input as the format string. If the first argument has %s or %d or other format specifiers in it, they will be parsed and replaced with values from the stack, which can easily break if it's expecting a pointer and gets an int instead.
This way tends to be a lot safer:
printf("%s", "....");
The output buffer will be flushed before exit, or before you get input, so the data will make it regardless of whether you send a \n.
printf could fail for any number of reasons. If you're deep in recursion, calling printf may blow your stack. The C and C++ standards have little to say on threading issues and calling printf while printf is executing in another thread may fail. It could fail because stdout is attached to a file and you just filled your filesystem, in which case the return value tells you there was a problem. If you call printf with a string that isn't zero terminated then bad things could happen. And printf can apparently fail if you're using buffered I/O and your buffer hasn't been flushed yet.

Resources