What is the point of format specifier in C? - c

What is the point of a format specifier in C if we have already set the type of the variable before printf?
For example:
#include<stdio.h>
int main(void)
{
int a=7;
printf("%d", a);
}
Like, it's already stated what a is: it's an integer (int). So what is the point of adding %d to specify that it's an integer?

The answer to this question really only makes sense in the context of C's history.
C is, by now, a pretty old language. Though undoubtedly a "high level language", it is famously low-level as high-level languages go. And its earliest compiler was deliberately and self-consciously small and simple.
In its first incarnation, C did not enforce type safety during function calls. For example, if you called sqrt(144), you got the wrong answer, because sqrt expects an argument of type double, but 144 is an int. It was the programmer's responsibility to call a function with arguments of the correct types: the compiler did not know (did not even attempt to keep track of) the arguments expected by each function, so it did not and could not perform automatic conversions. (A separate program, lint, could check that functions were called with the correct arguments.)
C++ corrected this deficiency, by introducing the function prototype. These were inherited by C in the first ANSI C standard in 1989. However, a function prototype only works for a function that expects a single, fixed argument list, meaning that it can't help for functions that accept a variable number of arguments, the premier example being: printf.
The other thing to remember is that, in C, printf is a more or less ordinary function. ("Ordinary" other than accepting a variable number of arguments, that is.) So the compiler has no direct mechanism to notice the types of the arguments and make that list of types available to printf. printf has no way of knowing, at run time, what types were passed during any given call; it can only rely (it must rely) on the clues provided in the format string. (This is by contrast to languages, many of them, where the print statement is an explicit part of the language parsed by the compiler, meaning that the compiler can do whatever it needs to do in order to treat each argument properly according to its known type.)
So, by the rules of the language (which are constrained by backwards compatibility and the history of the language), the compiler can't do anything special with the arguments in a printf call, other than performing what is called the default argument promotions. So the compiler can't fix things (can't perform the "correct" implicit conversion) if you write something like
int a = 7;
printf("%f", a);
This is, admittedly, an uncomfortable situation. These days, programmers are used to the protections and the implicit promotions provided for by function prototypes. If, these days, you can call
int x = sqrt(144);
and have the right thing happen, why can't you similarly call
printf("%f\n", 144);
Well, you can't, although a good, modern compiler will try to help you out anyway. Although the compiler doesn't have to inspect the format string (because that's printf's job to do, at run time), and the compiler isn't allowed to insert any implicit conversions (other than the default promotions, which don't help here), a compiler can duplicate printf's logic, inspect the format string, and issue strong warnings if the programmer makes a mistake. For example, given
printf("%f\n", 144);
gcc prints "warning: format ‘%f’ expects argument of type ‘double’, but argument 2 has type ‘int’", and clang prints "warning: format specifies type 'double' but the argument has type 'int'".
In my opinion, this is a fine compromise, balancing C's legacy behavior with modern expectations.

what is the point of adding %d to specify that it's an integer?
printf() is a function which receives a variable number of arguments of various types after the format argument. It does not directly know the number or the types of the arguments it receives.
The caller knows the argument count and types it gives to printf().
To pass this information, the caller uses the format argument to encode the argument count and types. printf() decodes that format to learn the argument count and types. It is very important that the format and the arguments that follow it are consistent.

printf() accepts a variable number of arguments. To process those variable arguments, it (via va_start()) needs to know what the last fixed argument is. It (via va_arg()) also needs to know the type of each argument so it can figure out how much data to read.
The format specifier is also a compact template (or DSL) to express how text and variables should be formatted including field width, alignment, precision, encoding.

Related

Choose the lesser evil incorrect Printf() statements: Fewer parameters vs extra parameters

A. printf("Values: X=%s Y=%s\n", x,y,z);
B. printf("Values: x=%s, Y=%s\n", x);
Both of the above printf() statements are incorrect: one has extra parameters, other has fewer parameters. I would like to choose between the lesser evil with an explanation. Can a modern C compiler help catch such problems? If yes, how does printf() implementor need to assist the compiler?
Both of the above printf() statements are incorrect: one has extra parameters, other has fewer parameters.
The first one is not incorrect according to the C standard. The rules for function calls in general, in C 2018 6.5.2.2, do not make it an error to pass unused arguments for a ... in the function prototype. For printf specifically, C 2018 7.21.6.1 2 (about fprintf, which the specification for printf refers to) says extra arguments are harmless:
… If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored…
Certainly if a programmer writes printf("Values: X=%s. Y=%s.\n", x, y, z);, they might have made a mistake, and a compiler would be reasonable in pointing out this possibility. However, consider code such as:
printf(ComputedFormat, x, y, z);
Here it is reasonable that we wish to print different numbers of values in different circumstances, and the ComputedFormat reflects this. It would be tedious to write code for each case and dispatch to them with a switch statement. It is simpler to write one call and let the computed format determine how many values are printed. So it is not always an error to have more arguments than the conversion specifications use.
I would like to choose between the lesser evil with an explanation.
The behavior of the latter code is not defined by the C standard. C 2018 7.21.6.1 2 also says:
… If there are insufficient arguments for the format, the behavior is undefined…
Thus, no behavior may be relied on from the latter code, unless there is some guarantee from the C implementation.
Can a modern C compiler help catch such problems?
Good modern C compilers have information about the specification of printf and, when the format argument is a string literal, they compare the number and types of the arguments to the conversion specifications in the string.
If yes, how does printf() implementor need to assist the compiler?
The implementor of printf does not need to do anything except conform to the specification of printf in the C standard. The aid described above is performed by the C compiler with reference to the C standard; it does not rely on features of the particular printf implementation.
On some platforms, information about the number of arguments passed is provided to the called routine. On such platforms, a printf implementor could check whether too few arguments are provided and signal an error in some way.
Eric Postpischil has already made a great answer that uses the most reliable source (the C standard), but I just want to post my own answer about why printf may behave as it does in both cases.
printf is a variadic function which can take a variable number of arguments. The way it knows how many you have passed is solely through the format string; every time it finds a format specifier, it takes the next argument out of the list (and assumes its type from which specifier has been used). Nothing really happens to any extra arguments: since there is no specifier for them, the function will not even try to take them and they will not be printed. So you may be warned about the extra arguments by the compiler, but the behavior in the first example is well-defined.
The second, on the other hand, is definitely undefined behavior. Since there are not enough arguments to match the number of format specifiers in the string, eventually when it finds the second %s, it will try to take the next variadic argument, but the issue is that you haven't passed any. When this happens for me, it prints some garbage value in place of the format specifier that doesn't look too nice. Anything could happen in undefined behavior though. In this case, the function seems to try to take the next variadic argument from a CPU register / the stack (memory) and fetches some garbage value that happened to be there (though again, anything could happen with undefined behavior).
So in short:
printf("%s\n", "Hello", "World");
        |         |     ^^^^^^^ Ignored
        -----------
and
printf("%s\n");  ?
        |        |
        ----------

Abnormal behaviour of sqrt() in C while using it in printf()

I am new to programming. I was finding the square root of a number using the sqrt() function in C.
scanf("%d", &n);
printf("%d\n", sqrt(n));
}
return 0;
}
When I enter a value of n = 5, I got some negative large number. Can anyone explain, please?
You've produced undefined behavior by passing the wrong type to printf: the %d format requires a matching argument of type int, but your argument has type double. You need %f (or %e or %g or %a) to print it. Also, there may be other problems, e.g. if you omitted #include <math.h>.
As others have pointed out, the problem here is that the format specifier is wrong. You need to #include <math.h> to get the proper return type of sqrt(), then use a format specifier like %f. Also, turn up your compiler warnings until it tells you something was wrong here. -Wall -Wextra -pedantic -Wno-system-headers is a good choice.
I’m adding an answer, though, to provide historical background on why float variables get promoted to double in printf() argument lists, but not scanf(), since this confused people in the comments.
In the instruction set of the DEC PDP-10 and PDP-11 computers, on which C was originally developed, the float type existed only to save space, and a program needed to convert a float to double to do any calculations on it. In early versions of C, before ANSI function prototypes, all float arguments to a function were promoted to double automatically before being passed (and also char to int). Originally, this ran better at a low level, and also had the advantage of avoiding round-off and overflow error on math using the shorter types. This convention also simplified writing functions that took a varying number of arguments of varying types, such as printf(). The caller could just pass anything in, the compiler would let it, and it was the called function’s job to figure out what the argument list was supposed to be at runtime.
When C added function prototypes, these old rules were kept for backward-compatibility only with legacy function declarations (extern double sqrt() rather than extern double sqrt(double) or the C11 type-generic equivalent). Since basically nobody writes functions that way any more, this is a historic curiosity, with one exception. A varargs function like int printf(const char*, ...); cannot be written in C with type checking of the variable arguments. (There is a C++11 way to do this using variadic templates.) The standards committee also did not want to break all existing code that printed a float. So those are still promoted according to the old rules.
In scanf(), none of this applies because the storage arguments are passed by reference, and scanf() needs to be sure it’s writing the data in the same type as the variable that holds it. Argument-promotion never comes into play, because only pointers are ever passed.
I met the same problem, and I wanted an answer of type int. I used an explicit cast, like:
printf("%d\n", (int)sqrt(n));
This happens because the return type and parameter type of sqrt() are not declared.
You can solve this either by including the header file:
#include<math.h>
or by explicitly declaring the return and parameter types like this:
double sqrt(double);
Also, as mentioned above, use the correct format specifier (e.g. %f).

Why is a random integer printed when the "more '%' conversions than data arguments" error occurs in C?

I took the age variable out of the printf() call just to see what happens, then compiled with make. It only emits warnings about more % conversions than data arguments and about the unused age variable, but no compile error. The executable runs, but every time I run it, it prints a different random integer. What causes this behavior?
#include <stdio.h>
int main(int argc, char *arg[]) {
int age = 10;
int height = 72;
printf("I'm %d years old\n");
printf("I'm %d inches tall\n", height);
return 0;
}
As per the printf() specification, if there are insufficient arguments for the format specifiers, the call invokes undefined behavior.
So, your code
printf("I'm %d years old\n");
which is missing the required argument for %d, invokes UB and is not guaranteed to produce any valid result.
Cross reference, C11 standard, chapter §7.21.6.1
[..] If there are insufficient arguments for the format, the behavior is undefined. [..]
According to the C Standard (7.21.6.1 The fprintf function - the same is valid for printf)
...If there are insufficient arguments for the format, the behavior is undefined. If the format is exhausted while arguments remain, the excess arguments are evaluated (as always) but are otherwise ignored.
On platforms where printf uses the cdecl calling convention (32-bit x86, for example), arguments are passed on the stack. The format string tells the function that one argument follows, so it reads that argument from the runtime stack; since you never put your number there, that location will probably contain some garbage data. So the value printed is arbitrary data.
With only one exception I know of, the C Standard imposes no requirements with regard to any action which in some plausible implementations might be usefully trapped. It is not hard to imagine a C compiler passing a variadic function like printf an indication of what arguments it has passed, nor would it be hard to imagine an implementer thinking that it could be useful to have the compiler trigger a trap if code tries to retrieve a variadic parameter of some type when the corresponding argument is of some other type or doesn't exist at all. Because it could be useful to have compilers trap in such cases, and because the behavior of such a trap would be outside the jurisdiction of the Standard, the Standard imposes no requirements about what may or may not happen when a variadic function tries to receive arguments which weren't passed to it.
In practice, rather than letting variadic functions know how many arguments they've received, most compilers simply have conventions which describe a relationship between the location of the non-variadic argument and the locations of subsequent variadic arguments. The generated code won't know whether a function has received e.g. two arguments of type int, but it will know that each such argument, if it exists, will be stored in a certain place. On such a compiler, using excess format specifiers will generally result in the generated code looking at the places where additional arguments would have been stored had they existed. In many cases, this location will have been used for some other purpose and then abandoned, and may hold the last value stored there for that purpose, but there is generally no reason to expect anything in particular about the contents of abandoned memory.

Output prediction

What will be the output of the following, and why?
printf("%d",2.37);
Apparently, printf is a variadic function, and we can never know the types in a variable argument list, so we always have to specify the format specifiers manually.
So 2.37, stored as a double according to the IEEE standard, would be fetched and printed in integer format.
But the output is 0.
What is the reason?
It is undefined behavior. You're passing a double argument to a function that expects to retrieve an int from its varargs macros, and there's no telling at all what that is going to lead to. In theory, it may even crash (with a calling convention that specifies that variadic arguments of different types are passed in different ways or on different stacks).

Why strlen function works without #include<string.h>?

Quick question:
strlen[char*] works perfectly regardless whether I #include <string.h> or not
All I get from compiler is a warning about implicit declaration, but functionally it works as intended.
Why is that?
When you invoke undefined behavior, one possible behavior is that the program behaves as you expected it to, on your system, and with the current version of the libraries and system software you have installed. This does not mean it's okay to do this. Actually, C99 removed implicit function declarations from the language, so a conforming C99 compiler must at least issue a diagnostic for one.
The function prototypes in C are not compulsory. They're useful indications to the compiler so that it can do type checking on types which are passed into them. When you don't include string.h, a default signature is assumed for the function which is why you get the warning.
If a function is called without a prototype in scope, the compiler will generate code to call a function which accepts whatever types of parameters are passed and accept an integer result. If the parameters match those the function expects, and if the function returns its result in a way the calling code can handle(*), all will be well. The purpose of prototyping is to ensure that arguments get converted into expected types if possible, and that compilation will fail if they cannot be converted. For example, if a non-prototyped function expects an argument of type 'long' and one attempts to pass an 'int', any of the following may occur:
The program may crash outright
Things may work as expected
The function may execute as though it were passed some arbitrary different parameter value
The program may continue to run, but with arbitrary values corrupting any or all program variables.
The computer may cause demons to fly out of the programmer's nose
By contrast, if the function were prototyped, the compiler would be guaranteed to do whatever was necessary to convert the 'int' to a 'long' prior to calling the function.
When C was originally conceived, prototypes didn't exist, and the programmer was responsible for ensuring that all arguments were passed with the precise types expected. In practice, it's a real pain to ensure that all function arguments are always the exact proper types (e.g. when passing the value five to a function that expects a long, one must write it as either "5L" or "(long)5"). Realistically speaking, there's never(**) any reason to rely upon implicit argument types, except with variadic functions.
(*) Any of the things that can happen with incorrect parameter types can happen with incorrect return types, except when a function is expected to return 'int' and the actual return value of the function would fit in an 'int', the results are more likely to be correct than when incorrect parameter types are used.
(**) The only exceptions I can think of would be if one was programming for some really old hardware for which a prototype-aware compiler was unavailable, or if one was programming for a code-golf or similar competition. The latter I consider puzzle-solving rather than programming, and I'm unaware of any hardware people would be interested in using for which the former condition would apply.
Because its declaration is equal to the so-called 'default declaration'. The compiler expects any unknown function to return int and to take the parameters as passed at the first use of the function in the code.
Usually this is because another header file which you have included ALSO includes string.h. Obviously it is bad practice to assume that you don't need to include something just because something else does, but it is most likely responsible for this effect.
I'm guessing it's because in C an int is the default data type returned from a function.
Can you give a fuller code example?
The function prototype is there in include files. So even if you don't include those files, a fixed prototype int function_name(); is written. Now the code of strlen() is there in the library files which are linked at run time, so the function gives correct output (only if it is the only function with a fixed prototype int function_name();).
