I just happened to look at the prototype of the printf (and other fprintf class of functions) -
int printf(const char * restrict format, ...);
The keyword restrict if I understand correctly disallows access to the same object through two pointers if one of them is marked restrict.
An example that cites the same from the C standard is here.
One benefit of marking the format as restrict I think is saving the function from the chance that the format string might get modified during the execution (say because of the %n format specifier).
But does this impose a bigger constraint? Does this make the following function call invalid?
char format[] = "%s";
printf(format, format);
Because there is clearly aliasing here. Why was the restrict keyword added to the format argument of printf?
cppreference
During each execution of a block in which a restricted pointer P is declared (typically each execution of a function body in which P is a function parameter), if some object that is accessible through P (directly or indirectly) is modified, by any means, then all accesses to that object (both reads and writes) in that block must occur through P (directly or indirectly), otherwise the behavior is undefined.
(emphasis mine)
It means that:
char format[] = "%s";
printf(format, format);
Is well-defined because printf won't attempt to modify format.
The only thing that restrict makes undefined is 'writing to the format string using %…n while printf is running' (e.g. char f[] = "%hhn"; printf(f, (signed char *)f);).
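As a minimal sketch of the well-defined case, the hypothetical helper echo_format below passes the same array both as the format and as the %s argument. It uses snprintf rather than printf only so that the result can be inspected; the restrict qualifier on the format parameter is the same.

```c
#include <stdio.h>

/* Hypothetical helper: formats into `out` using a format string that is
   also passed as the %s argument. Nothing writes to `format` during the
   call, so the restrict qualifier on snprintf's format parameter is not
   violated. */
int echo_format(char *out, size_t n)
{
    char format[] = "%s";
    return snprintf(out, n, format, format);
}
```

Called with a buffer of at least 3 bytes, this leaves the two characters %s in the buffer and returns 2.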
Why was the restrict keyword added to the format argument of printf?
restrict is essentially a hint the compiler might use to optimize your code better.
Since restrict may or may not make code run faster, but it can never make it slower (assuming the compiler is sane), it should be used always, unless:
Using it would cause UB
It makes no significant performance improvement in this specific case
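To illustrate the kind of optimization this hint permits, here is a small sketch. The function name add_to_all is made up for this example, and whether a given compiler actually exploits the hint is not guaranteed.

```c
#include <stddef.h>

/* Hypothetical example: because dst and src are restrict-qualified, the
   compiler may assume that writes through dst never change *src, so it
   can keep *src in a register instead of reloading it each iteration. */
void add_to_all(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += *src;
}
```

A call such as add_to_all(a, &a[0], 3) would break the contract, because the object read through src is modified through dst, and the behavior would be undefined.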
Why is the format in printf marked as restrict?
int printf(const char * restrict format, ...);
The restrict in some_type * restrict format is a "contract" between the calling code and the function printf(). It allows printf() to assume that the only possible changes to the data pointed to by format come from what the function does directly, not from side effects through other pointers.
This allows printf() to consist of code that does not concern itself with a changing format string by such side effects.
Since format points to const data, printf() is also not allowed to change the data. Yet this is ancillary to the restrict feature.
Consider pathological code below. It violates the contract as printf() may certainly alter the state of *stdout, which in turn can alter .ubuf.
strcpy(stdout->ubuf, "%s");
printf(stdout->ubuf, "Hello World!\n");
@HolyBlackCat has a good "%n" example.
Key: restrict requires the calling code to not pass as format, any pointer to a string that may change due to printf() operation.
Related
Is it possible to make declaration of variadic function so that it doesn't end with "..."?
Today I learned more about exec from unistd.h but through the day I've seen three (two actually) different declaration of execl:
1) int execl ( const char * path, const char * arg0, ..., (char*)NULL ); was shown to us in school and I imagined I would have to end the function call with a NULL value
2) int execl(const char *path, const char *arg, ... /* (char *) NULL */); is what I've found in the exec(3) man page. That would probably mean I still have to end it with a NULL value, but it is not enforced.
3) int execl(const char *path, const char *arg, ...); is what I found here. This one would normally put me at rest, with the first one being a simplification for students, the second a warning, and this the real thing (even though I would normally have higher regard for options one and two).
But then I found on the same site this declaration:
int execle(const char *path, const char *arg, ..., char * const envp[]);
Same question applies, I was unable to create variadic function not ending in ... with gcc telling me that it's expecting ')' before ',' token pointing to the comma after the three dots.
So finally, is it possible to make variadic functions ending with a NULL characters (execl) and if not, is it possible to make it end with predefined variable (execle)?
I tried to compile with gcc 6.3.1, I also tried --std=c11.
Is it possible to make declaration of variadic function so that it doesn't end with "..."?
Is it possible is a slippery question, but consider these facts:
the standard says that "If a function that accepts a variable number of arguments is defined without a parameter type list that ends with the ellipsis notation, the behavior is undefined" (C2011, 6.9.1/8)
Perhaps that answers the question already, but if you choose to mince words and focus on function declarations that are not definitions, then
a function definition is also a declaration
the C language standard requires all declarations of the same function to be "compatible" (else program behavior is undefined) (C2011 6.7/4)
two function declarations with mismatched parameter lists are not compatible (C2011, 6.2.7/3)
Thus, if you declare a variadic function that in fact is also defined, and that function's parameter list does not end with ..., then the program's behavior is undefined.
The documentation you've been reading for execle() and execl() is written to express and discuss those functions' expectations, but to the extent that it seems to present variadic function declarations in which the last element of the parameter list is not ..., those are not actually valid C function declarations.
So finally, is it possible to make variadic functions ending with a NULL characters (execl) and if not, is it possible to make it end with predefined variable (execle)?
It is not possible to describe such calling conventions via conforming C declarations. Variadic functions can have such expectations, and can enforce them at runtime, but they can be enforced at compile time only by a compiler that relies on special knowledge of the functions involved, or on C language extensions that allow such constraints to be described.
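One such extension, as a sketch: GCC and Clang provide a sentinel function attribute, and GCC's documentation says the missing-sentinel warnings are enabled with -Wformat. The function count_strings and the macro NEEDS_SENTINEL are hypothetical names used only for this illustration.

```c
#include <stdarg.h>
#include <stddef.h>

/* GCC/Clang extension (an assumption about the toolchain): the compiler
   can warn when a call does not end in a null pointer. */
#if defined(__GNUC__)
#define NEEDS_SENTINEL __attribute__((sentinel))
#else
#define NEEDS_SENTINEL
#endif

/* Hypothetical function: counts the strings up to the terminating null. */
int count_strings(const char *first, ...) NEEDS_SENTINEL;

int count_strings(const char *first, ...)
{
    va_list ap;
    int count = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != NULL) {
        count++;
        s = va_arg(ap, char *);
    }
    va_end(ap);
    return count;
}
```

The declaration itself still ends with the ellipsis; the attribute only adds a diagnostic for calls such as count_strings("a", "b") that omit the final (char *)NULL.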
The declaration of a variadic function can only specify the required arguments, and the compiler can enforce their types. The variable-length part never has any type checking done. And the variable-length part is always at the end. The declaration for execle() is not meant as an actual C declaration, but just to describe to the programmer how he should construct the arguments.
It's not possible to enforce that the last argument to execl() is NULL. Variadic functions don't know how many arguments were supplied; they determine it from the values of the arguments. printf() assumes that it has enough arguments for all the conversion specifiers in the format string, and execl() iterates through the arguments until it finds NULL (execle() is similar, but it reads one additional argument to get envp). If you don't end with NULL, it will just keep going, reading garbage and causing undefined behavior.
The declaration you see is the one in the man pages of execl. The declaration for execle in glibc is the following: int execle (const char *path, const char *arg, ...). The implementation assumes that the argument following the terminating NULL is a char **, and uses it for envp. I don't think you can enforce such a rule in C.
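A rough sketch of how an execle()-style function can pick up the extra argument after the terminator is shown below. The function count_args_and_env is a hypothetical stand-in and is not how any particular C library implements execle().

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for an execle()-like interface: the arguments are
   char * values terminated by a null pointer, followed by one more
   argument of type char *const * (the environment). */
int count_args_and_env(char *const **envp_out, const char *arg0, ...)
{
    va_list ap;
    int argc = 0;
    const char *a = arg0;

    va_start(ap, arg0);
    while (a != NULL) {
        argc++;
        a = va_arg(ap, char *);
    }
    /* Only after the null terminator is the extra envp argument read. */
    *envp_out = va_arg(ap, char *const *);
    va_end(ap);
    return argc;
}
```

Nothing in the declaration tells the compiler about this layout; a caller that forgets the terminator or the trailing environment pointer gets undefined behavior at run time.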
I have read the post sprintf format specifier replace by nothing, and others related, but have not seen this addressed specifically.
Until today, I have never seen sprintf used with only 2 arguments.
The prototype my system uses for sprintf() is:
int sprintf (char Target_String[], const char Format_String[], ...);
While working with some legacy code I ran across this: (simplified for illustration)
char toStr[30];
char fromStr[]={"this is the in string"};
sprintf(toStr, fromStr);
My interpretation of the prototype is that the second argument should be comprised of a const char[], and accepting standard ansi C format specifiers such as these.
But the above example seems to work just fine with the string fromStr as the 2nd argument.
Is it purely by undefined behavior that this works?, or is this usage perfectly legal?
I am working on Windows 7, using a C99 compiler.
Perfectly legal. The variadic arguments are optional.
In this case sprintf serves as strcpy, but it still parses the format string for % specifiers.
I'd write sprintf(toStr,"%s",fromStr); so it doesn't have to parse that long string.
The behavior you are observing is correct, a format string is not required to have any conversion specifiers. In this case the variable-length argument list, represented by ..., has length of zero. This is perfectly legal, although it's definitely less efficient than its equivalent
strcpy(toStr, fromStr);
It's perfectly legal code, but
If you just want to copy a string, use strcpy() instead.
If you are working with user input, you could be making yourself vulnerable to a format string attack.
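If a printf-family function is kept for the copy, a common way to sidestep the format string problem is to pass the text as data with a fixed "%s" format. A minimal sketch, where copy_text is a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: copies `src` into `dst` (capacity `n`) without
   interpreting any '%' in `src`, and null-terminates when n > 0.
   Returns the length that would have been written without truncation. */
int copy_text(char *dst, size_t n, const char *src)
{
    return snprintf(dst, n, "%s", src);
}
```

With this, an input such as "100%d %n" is copied verbatim instead of being treated as a request to read arguments that were never passed.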
Synopsis for sprintf is:
int sprintf(char *str, const char *format, ...);
That means calling it with 2 arguments is a legal option.
It works because there are no further parameters to print (i.e. no % conversion in the format).
It's no different from printf without a second parameter:
int printf ( const char * format, ... );
It also works if you don't have any second parameter:
printf(fromStr);
the second argument should be comprised of a const char[]
A const qualifier on a pointer parameter is a promise that the function does not modify the data through that parameter (which it otherwise could, because arrays are passed to the function by address). It does not require a const object to be used in the actual call.
The code you posted does not use a const string as the second argument to sprintf(), but the conversion from non-const to const is implicit; there is no need to worry there.
accepting standard ansi C format specifiers
"accepting" does not mean "requiring". The format string you specified does not contain any format specifier. Accordingly, the function is called with only 2 arguments (no values to format). A third argument would be ignored by sprintf() anyway, and many modern compilers would issue a warning about it.
Update: I don't want to start a debate about which compilers are modern and which are not.
It happens that I'm using the default compiler on OS X 10.11, and this is what it outputs:
axiac: ~/test$ cc -v
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.3.0
Thread model: posix
axiac: ~/test$ cc -o 1 1.c
1.c:8:25: warning: data argument not used by format string [-Wformat-extra-args]
sprintf(x, "abc\n", n);
~~~~~~~ ^
I'm on programming project 4 from chapter 19 of C programming, A Modern Approach. My code works but I get this warning trying to pass a function returning a void * parameter to printf with conversion specifier %s.
format %s expects argument of type char *, but argument 2 has type void * [-Wformat=]
I can easily get rid of the warning by casting the return type of the function to char *, like
printf("%s\n", (char *) function(param));
but I just want to know why this is necessary, since type void * is converted to another pointer type automatically.
Compiler is very right to complain in this case.
By your own logic, the function returning void * could return a structure pointer converted to void *, and then %s wouldn't be able to print that, would it?
So, if you know what you're doing, you can cast the result, for this case.
Also, as others pointed out, it may be worth mentioning that this warning has nothing to do with the standard specification: the language itself places no compile-time restriction on the types of the variable arguments. (Borrowing Mr. @WhozCraig's words) This warning comes from an additional layer of type checking performed entirely by the compiler on its own, enabled by the -Wformat flag in gcc.
As far as the pure language is concerned (not the standard library and its expectations, the actual formal language) you can push anything you want on that argument list (including something utterly incoherent in relation to the requirements of a %s format specifier of some library routine). Of course, unless whatever you pushed ultimately is, in fact, the address of a null-terminated sequence of char, printf itself will traipse into undefined behavior at your behest.
The warning you're receiving is based on an additional layer of api-checking within the compiler, not some violation of the language itself. That api checking matches format specs with the types of the presented arguments for frequently used standard library apis such as printf, scanf, etc. Could the author of that warning-check have been a little more forgiving and ignored void* arguments for specs expecting pointer types? Certainly, but the point of the check feature would dwindle pretty rapidly were that the case. Consider this:
int a = 0;
void *b = &a;
printf("%s\n", b);
If that api-check feature is going to be worth any salt at all it had better bark about that mismatched type, because as far as the language itself is concerned, there is nothing wrong with this code. And that has nothing to do with what evil I just requested it do. As far as the language is concerned, printf is simply this:
int printf(const char *format, ...);
And the call I setup certainly fulfills that (bad for me, and thankfully, the api-checks of my modern compiler will let me know soon enough there may be a problem).
A pointer is a variable that holds the address of a memory location.
The number of bytes accessed through the pointer depends on the pointed-to type. So if it is an int*, the target is interpreted as sizeof(int) bytes (commonly 4); if it is a char*, it is interpreted as 1 byte.
A void* carries no information about the pointed-to type, so the compiler can't dereference it. For the compiler to know how to interpret the memory, a cast is needed here.
The printf function is declared as something like this:
int printf(const char *format, ...);
Here ... denotes any additional arguments the caller supplied (that is, your string you wanted to print). When printf examines these additional parameters, it uses some low-level code, which has no type safety.
This code cannot determine that the parameter has type void*, and cast it to char* automatically. Instead, if the binary representation of void* and char* is the same, the parameter can be extracted from the ... part without regard to its actual type. If the representation is different, the low-level code will try to print an incorrect value (and probably crash).
The representation of void* and char* is the same on all platforms that I know of (and the C standard requires a pointer to void to have the same representation and alignment requirements as a pointer to a character type), so it's probably safe (if you trust me, that is - please don't!). However, if you compile with gcc -Wall -Werror, as some people recommend, all warnings are upgraded to errors, so you should do the casting, as the compiler indicates.
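A small sketch of the cast in context follows. Both get_name and format_name are hypothetical names; get_name stands in for the question's function that returns void *, and snprintf is used instead of printf only so the result can be checked.

```c
#include <stdio.h>

/* Hypothetical function that, like the one in the question, hands back a
   string through a void * return type. */
void *get_name(void)
{
    static char name[] = "Ada";
    return name;
}

/* The explicit cast documents that the caller knows the void * really
   points at a null-terminated string, and it satisfies -Wformat. */
int format_name(char *out, size_t n)
{
    return snprintf(out, n, "%s", (char *)get_name());
}
```

The cast changes nothing about the pointer value; it only makes the argument's static type agree with what %s expects.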
Suppose I have the following function signature:
int printf(const char * restrict format, ... );
Now, I have a string defined as follows:
volatile char command_str[256];
Now, when I want to pass this string to my printf function, I will get the following warning:
Warning 32 [N] 2943 : passing 'volatile char [256]' to parameter of type 'const char *' discards qualifiers C:\P\parameter\parameter.c
I do not want to change the printf signature, the easiest solution to make the warning go away would be
printf((const char*)command_str, .......);
I have a feeling that this is not the best solution. What would be the correct thing to do? I cannot make command_str non-volatile since it is accessed within an interrupt.
The const in printf()'s signature declares a promise printf() makes -- it won't mess with the data pointed to by format (therefore, both char* and const char* variables may be passed in for format).
Now, your array is volatile (and I expect you know the implication of that). The compiler warns you, that this volatility is discarded in printf()'s scope -- you won't get volatile semantics for accesses to format within printf().
As a suggestion for what to do, I'd say evaluate whether you really want changes to the data to be visible in the middle of printf(). I can't see a reason for wanting that, so making a local copy sounds reasonable.
The function you are passing it to (printf()) expects the string not to change behind its back (const char * only means that printf() itself will not modify the content, not to be confused!), while the string you are trying to pass can be modified at any time by an interrupt.
How can you be sure that an interrupt will not modify the contents of the string between you calling printf() and printf() actually printing...? What prevents the interrupt from happening while printf() is working?
You need to mask interrupts while calling printf() (using ASM {"CLI"} or something more applicable to your platform), or just copy the string you pass to printf():
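// point a
char s[256];
size_t i;
for (i = 0; i < sizeof s - 1 && command_str[i] != '\0'; i++)
    s[i] = command_str[i]; // element-wise copy keeps the volatile accesses
s[i] = '\0';
// point b
printf("%s", s);
// point c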
This will fix the problem for printf(), but now you have a new race condition between points a and b. I think you need to refactor your code. You have bigger issues.
One solution might be:
char s[256];
size_t i;
mask_interrupts();
for (i = 0; i < sizeof s - 1 && command_str[i] != '\0'; i++)
    s[i] = command_str[i];
s[i] = '\0';
unmask_interrupts();
printf("%s", s);
As you can see from the code snippet below, I have declared one char variable and one int variable. When the code gets compiled, it must identify the data types of variables str and i.
Why do I need to tell again during scanning my variable that it's a string or integer variable by specifying %s or %d to scanf? Isn't the compiler mature enough to identify that when I declared my variables?
#include <stdio.h>
int main ()
{
char str [80];
int i;
printf ("Enter your family name: ");
scanf ("%s",str);
printf ("Enter your age: ");
scanf ("%d",&i);
return 0;
}
Because there's no portable way for variable-argument functions like scanf and printf to know the types of the variable arguments, or even how many arguments are passed.
See C FAQ: How can I discover how many arguments a function was actually called with?
This is the reason there must be at least one fixed argument to determine the number, and maybe the types, of the variable arguments. And this argument (the standard calls it parmN, see C11(ISO/IEC 9899:201x) §7.16 Variable arguments ) plays this special role, and will be passed to the macro va_start. In other words, you can't have a function with a prototype like this in standard C (before C23):
void foo(...);
The reason why the compiler cannot provide the necessary information is simply that the compiler is not involved here. The prototype of these functions doesn't specify the types, because these functions accept arguments of varying types. So the actual data types are not determined at compile time, but at runtime.
The function then takes one argument after the other from the stack (or wherever the calling convention places them). These values don't have any type information associated with them, so the only way the function knows how to interpret the data is by using the caller-provided information, which is the format string.
The functions themselves don't know which data types are passed in, nor do they know the number of arguments passed, so there is no way that printf can decide this on its own.
In C++ you can use operator overloading, but this is an entirely different mechanism, because there the compiler chooses the appropriate function based on the data types and the available overloaded functions.
To illustrate this, printf, when compiled looks like this:
push value1
...
push valueN
push format_string
call _printf
And the prototype of printf is this:
int printf ( const char * format, ... );
So there is no type information carried over, except what is provided in the format string.
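To make the point concrete, here is a rough sketch of a formatter that understands only %d and %s. The name mini_format is hypothetical, and the code leans on snprintf for the actual conversions; it is meant only to show that the format string is what selects the type handed to va_arg.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical mini-formatter supporting only %d and %s. The format string
   is the only source of type information for the variable arguments. */
int mini_format(char *out, size_t n, const char *fmt, ...)
{
    va_list ap;
    size_t used = 0;

    if (n == 0)
        return 0;
    va_start(ap, fmt);
    for (; *fmt != '\0' && used < n - 1; fmt++) {
        if (fmt[0] == '%' && fmt[1] == 'd') {
            /* "%d" says the next argument is an int */
            used += (size_t)snprintf(out + used, n - used, "%d", va_arg(ap, int));
            fmt++;
        } else if (fmt[0] == '%' && fmt[1] == 's') {
            /* "%s" says the next argument is a char * */
            used += (size_t)snprintf(out + used, n - used, "%s", va_arg(ap, char *));
            fmt++;
        } else {
            out[used++] = *fmt;
        }
    }
    va_end(ap);
    if (used > n - 1)
        used = n - 1; /* a conversion was truncated */
    out[used] = '\0';
    return (int)used;
}
```

If the caller writes "%d" but passes a string, va_arg(ap, int) reads the wrong type and the behavior is undefined, which is exactly the situation the real printf is in.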
printf is not an intrinsic function. It's not part of the C language per se. All the compiler does is generate code to call printf, passing whatever parameters. Now, because C does not provide reflection as a mechanism to figure out type information at run time, the programmer has to explicitly provide the needed info.
The compiler may be smart, but the functions printf and scanf are stupid - they do not know the type of the parameter you pass in each call. This is why you need to pass %s or %d every time.
The first parameter is a format string. If you're printing a decimal number, it may look like:
"%d" (decimal number)
"%5d" (decimal number padded to width 5 with spaces)
"%05d" (decimal number padded to width 5 with zeros)
"%+d" (decimal number, always with a sign)
"Value: %d\n" (some content before/after the number)
etc, see for example Format placeholders on Wikipedia to have an idea what format strings can contain.
Also there can be more than one parameter here:
"%s - %d" (a string, then some content, then a number)
Isn't the compiler matured enough to identify that when I declared my variable?
No.
You're using a language specified decades ago. Don't expect modern design aesthetics from C, because it's not a modern language. Modern languages will tend to trade a small amount of efficiency in compilation, interpretation or execution for an improvement in usability or clarity. C hails from a time when computer processing time was expensive and in highly limited supply, and its design reflects this.
It's also why C and C++ remain the languages of choice when you really, really care about being fast, efficient or close to the metal.
The prototype of scanf, int scanf ( const char * format, ... );, says that it stores the given data, according to the parameter format, into the locations pointed to by the additional arguments.
It is not related to the compiler; it is all about the interface defined for scanf. The format parameter is required to let scanf know the type, and therefore the size, of the data to be stored at each location.
GCC (and possibly other C compilers) keep track of argument types, at least in some situations. But the language is not designed that way.
The printf function is an ordinary function which accepts variable arguments. Variable arguments require some kind of run-time-type identification scheme, but in the C language, values do not carry any run time type information. (Of course, C programmers can create run-time-typing schemes using structures or bit manipulation tricks, but these are not integrated into the language.)
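One way such a hand-rolled run-time typing scheme might look is a tagged union, sketched below. The names kind, value and describe are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical hand-rolled run-time typing: each value carries a tag
   that says which union member is valid. */
enum kind { KIND_INT, KIND_DOUBLE, KIND_STRING };

struct value {
    enum kind kind;
    union {
        int i;
        double d;
        const char *s;
    } as;
};

/* Formats a value by inspecting its tag at run time. */
int describe(char *out, size_t n, struct value v)
{
    switch (v.kind) {
    case KIND_INT:    return snprintf(out, n, "int %d", v.as.i);
    case KIND_DOUBLE: return snprintf(out, n, "double %.1f", v.as.d);
    case KIND_STRING: return snprintf(out, n, "string %s", v.as.s);
    }
    return -1;
}
```

The tag is ordinary data maintained by the programmer; the language does nothing to keep it in sync with the member that was actually stored.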
When we develop a function like this:
void foo(int a, int b, ...);
we can pass "any" number of additional arguments after the second one, and it is up to us to determine how many there are and what are their types using some sort of protocol which is outside of the function passing mechanism.
For instance if we call this function like this:
foo(1, 2, 3.0);
foo(1, 2, "abc");
there is no way that the callee can distinguish the cases. There are just some bits in a parameter passing area, and we have no idea whether they represent a pointer to character data or a floating point number.
The possibilities for communicating this type of information are numerous. For example in POSIX, the exec family of functions use variable arguments which have all the same type, char *, and a null pointer is used to indicate the end of the list:
#include <stdarg.h>
void my_exec(char *progname, ...)
{
va_list variable_args;
va_start (variable_args, progname);
for (;;) {
char *arg = va_arg(variable_args, char *);
if (arg == 0)
break;
/* process arg */
}
va_end(variable_args);
/*...*/
}
If the caller forgets to pass a null pointer terminator, the behavior will be undefined because the function will keep invoking va_arg after it has consumed all of the arguments. Our my_exec function has to be called like this:
my_exec("foo", "bar", "xyzzy", (char *) 0);
The cast on the 0 is required because there is no context for it to be interpreted as a null pointer constant: the compiler has no idea that the intended type for that argument is a pointer type. Furthermore (void *) 0 isn't correct because it will simply be passed as the void * type and not char *, though the two are almost certainly compatible at the binary level so it will work in practice. A common mistake with that type of exec function is this:
my_exec("foo", "bar", "xyzzy", NULL);
where the compiler's NULL happens to be defined as 0 without any (void *) cast.
Another possible scheme is to require the caller to pass down a number which indicates how many arguments there are. Of course, that number could be incorrect.
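A minimal sketch of the count-prefixed scheme, assuming all the variable arguments are ints (sum_ints is a hypothetical name):

```c
#include <stdarg.h>

/* Hypothetical example: the first argument says how many int arguments
   follow. A count larger than the number of arguments actually passed
   leads to undefined behavior. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

For example, sum_ints(3, 1, 2, 3) returns 6, while sum_ints(4, 1, 2, 3) reads an argument that was never passed.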
In the case of printf, the format string describes the argument list. The function parses it and extracts the arguments accordingly.
As mentioned at the outset, some compilers, notably the GNU C Compiler, can parse format strings at compile time and perform static type checking against the number and types of arguments.
However, note that a format string can be something other than a literal, and may be computed at run time, which is impervious to such type checking schemes. Fictitious example:
char *fmt_string = message_lookup(current_language, message_code);
/* no type checking from gcc in this case: fmt_string could have
four conversion specifiers, or ones not matching the types of
arg1, arg2, arg3, without generating any diagnostic. */
snprintf(buffer, sizeof buffer, fmt_string, arg1, arg2, arg3);
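For user-written variadic functions, GCC and Clang let you opt in to the same static checking through the format function attribute. The sketch below assumes such a toolchain; format_into and PRINTF_LIKE are hypothetical names.

```c
#include <stdarg.h>
#include <stdio.h>

/* GCC/Clang extension (an assumption about the toolchain): tells the
   compiler that argument `fmt` is a printf-style format and that the
   values to check start at argument `first`. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt, first) __attribute__((format(printf, fmt, first)))
#else
#define PRINTF_LIKE(fmt, first)
#endif

/* Hypothetical wrapper around vsnprintf. */
int format_into(char *buf, size_t n, const char *fmt, ...) PRINTF_LIKE(3, 4);

int format_into(char *buf, size_t n, const char *fmt, ...)
{
    va_list ap;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(buf, n, fmt, ap);
    va_end(ap);
    return len;
}
```

As with printf itself, the check only helps when the format is a string literal at the call site; a format computed at run time still goes unchecked.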
It is because this is the only way to tell functions like printf and scanf which type of value you are passing. For example:
int main()
{
int i=22;
printf("%c",i);
return 0;
}
This code will print the character whose code is 22, not the integer 22, because you have told printf to treat the value as a char.
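The same effect is easier to see with a printable value. In this sketch, show_both is a hypothetical helper that formats one int with both %c and %d; the character shown for 65 assumes an ASCII-compatible execution character set.

```c
#include <stdio.h>

/* Hypothetical helper: formats the same int twice, once as a character
   and once as a decimal number. */
int show_both(char *out, size_t n, int value)
{
    return snprintf(out, n, "%c %d", value, value);
}
```

On an ASCII system, show_both(buf, sizeof buf, 65) produces "A 65": the value is identical, only the requested format differs.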
printf and scanf are I/O functions that are designed and defined in a way to receive a control string and a list of arguments.
The functions do not know the types of the parameters passed to them, and the compiler can't pass this information to them either.
Because in the printf you're not specifying data type, you're specifying data format. This is an important distinction in any language, and it's doubly important in C.
When you scan in a string with %s, you're not saying "parse a string input for my string variable." You can't say that in C because C doesn't have a string type. The closest thing C has to a string variable is a fixed-size character array that happens to contain characters representing a string, with the end of the string indicated by a null character. So what you're really saying is "here's an array to hold the string, I promise it's big enough for the string input I want you to parse."
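Because %s trusts that promise, it is common to put a maximum field width in the format so the scan cannot overrun the array. A minimal sketch using sscanf on an in-memory string (read_name and NAME_SIZE are hypothetical names):

```c
#include <stdio.h>

#define NAME_SIZE 80

/* Hypothetical helper: reads one whitespace-delimited word from `input`
   into `name`, which must have room for NAME_SIZE chars. The width 79
   leaves space for the terminating null character. */
int read_name(const char *input, char name[NAME_SIZE])
{
    return sscanf(input, "%79s", name);
}
```

The width has to be written into the format by hand; scanf has no way to learn the size of the array from the pointer it receives.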
Primitive? Of course. C was invented over 40 years ago, when a typical machine had at most 64K of RAM. In such an environment, conserving RAM had a higher priority than sophisticated string manipulation.
Still, the %s scanner persists in more advanced programming environments, where there are string data types. Because it's about scanning, not typing.