Let's take the following five examples:
// OK, most correct
printf("%10.4hd XXX", (short) 2);
// OK, no warning
printf("%10.4d XXX", (short) 2);
// Error: [-Werror,-Wformat]
printf("%10.4hhd XXX", (short) 2);
// Error: [-Werror,-Wformat]
printf("%10.4ld XXX", (short) 2);
// Error: [-Werror,-Wformat]
printf("%10.4f XXX", (short) 2);
Why, for example, does the second one work fine, but the third, fourth, and fifth ones do not?
In C, the expression (short)2 is subject to the integer promotion rules before it's passed to printf. That means it becomes an int.
printf provides the h length modifier, but the value it consumes will be an int, not a short, because in C there's no way to pass a short value directly to a variadic function.
From ISO/IEC 9899:TC3 section 7.19.6.1, paragraph 7, on the h length modifier:
Specifies that a following d, i, o, u, x, or X conversion specifier
applies to a short int or unsigned short int argument (the argument
will have been promoted according to the integer promotions, but its
value shall be converted to short int or unsigned short int before
printing); or that a following n conversion specifier applies to a
pointer to a short int argument.
The %hhd format string should also work for similar reasons. I guess the reason that clang warns for this one and not for "%d", (short) is that the latter is technically correct and there's a lot of C code that uses %d for printing shorts.
The %ld is probably always undefined behavior, but may or may not work in practice depending on whether int and long have the same representation (typically this is the case when int and long are both 32 bits, as on most 32-bit platforms and on 64-bit Windows).
The %f is always undefined behavior. %f expects a double argument (not a float, since floats are always promoted to double when passed to a var-args function), and you've given it an int.
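To make the first two cases concrete, here is a minimal sketch that formats the same short with both %hd and %d. The helper names format_short_hd and format_short_d are made up for illustration, and snprintf is used instead of printf only so the result can be inspected as a string.

```c
#include <stdio.h>

/* Hypothetical helper: formats a short with the matching h length modifier. */
static int format_short_hd(char *buf, size_t size, short value)
{
    /* value is promoted to int at the call; %hd converts it back to short. */
    return snprintf(buf, size, "%10.4hd XXX", value);
}

/* Hypothetical helper: same value, formatted with plain %d. */
static int format_short_d(char *buf, size_t size, short value)
{
    /* value is promoted to int at the call, which is exactly what %d expects. */
    return snprintf(buf, size, "%10.4d XXX", value);
}
```

For a value of 2, both helpers should produce the same text: the precision pads the number to 0002, and the field width right-aligns it in ten columns.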
From this site:
When a function with a variable-length argument list is called, the
variable arguments are passed using C's old ``default argument
promotions.'' These say that types char and short int are
automatically promoted to int, and type float is automatically
promoted to double. Therefore, varargs functions will never receive
arguments of type char, short int, or float.
printf is a function with a variable-length argument list. So when you pass in a short, it gets promoted to an int.
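You can observe the promotion from the receiving side with a small variadic function of your own. This is only a sketch: sum_promoted and its kinds string are hypothetical names, with 'i' marking an integer argument and 'f' marking a floating-point argument.

```c
#include <stdarg.h>

/* Hypothetical variadic helper. Each character in kinds describes one
   extra argument: 'i' for an integer, 'f' for a floating-point value. */
static double sum_promoted(const char *kinds, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, kinds);
    for (const char *k = kinds; *k != '\0'; ++k) {
        if (*k == 'i') {
            /* char and short arguments arrive as int, so read an int. */
            total += va_arg(ap, int);
        } else if (*k == 'f') {
            /* float arguments arrive as double, so read a double. */
            total += va_arg(ap, double);
        }
    }
    va_end(ap);
    return total;
}
```

A call such as sum_promoted("iif", (char)1, (short)2, 1.5f) reads the first two arguments as int and the third as double, even though the caller wrote char, short, and float.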
Now, even though printf will never receive a short, you can tell it that you started with one. So %...hd tells printf that you passed in a short and it has been converted to an integer, and printf will try to do the right thing with it (convert it back to a short internally).
%...hhd works the same way but you use it when you pass in a char and it gets promoted to an int. I guess that the compiler doesn't allow %...hhd in your case because it is smart enough to notice that you didn't pass in a char (you passed in a short).
%...d works because printf sees an int (and the compiler has decided that it doesn't mind that you didn't use hd).
%...ld and %...f don't work because you didn't pass in a long or a float, and a short doesn't get promoted to either of those types.
Related
In this case,
#include <stdio.h>
int main()
{
unsigned char a = 1;
printf("%hhu", -a);
return 0;
}
The argument -a in printf is promoted to int by the integer promotion by the unary minus operator and subsequently promoted by the default argument promotion and finally converted to unsigned char by the format specifier.
So -a => -(int)a (by unary -) => no conversion by function call => (unsigned char)-(int)a (by %hhu). Is my thought right?
You are correct that a is promoted to int in -a, and that printf("%hhu", -a); passes an int to printf. The notional conversion performed with %hhu is not clear.
Note that if a is not zero, then -a produces a value (in an int) that is not an unsigned char value. Further, with two’s complement eight-bit signed char, if a is greater than 128, then -a produces a value that is not a signed char value.
To understand %hhu, we look at the specification for u in C 2018 7.21.6.1 8:
The unsigned int argument is converted to unsigned octal (o), unsigned decimal (u),…
and for hh in 7.21.6.1 7:
Specifies that a following d, i, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing);…
First we have to resolve this issue of “signed char or unsigned char”. Does this say we can pass either a signed char or an unsigned char for %hhu? I think not; I think the authors have just put together the language for %hhd (intended to convert a signed char) and %hhu (intended to convert an unsigned char). So I believe the intent is that a promoted unsigned char should be passed for the %hhu conversion specification.
Apple Clang 11.0.0 seems to agree, when passing -a (but not a), it warns: “warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat]”
As noted above, passing -a may pass a value that cannot result from passing a promoted unsigned char. It may even pass a value that cannot result from passing a promoted signed char or unsigned char. In this case, it can be argued we have violated the requirement to pass an unsigned char, and therefore the C standard does not specify the resulting behavior. Even though it says the passed value shall be converted to an unsigned char, I believe that is a notional conversion, not a specific requirement on the library implementation, and that it also falls under the “as if” rules: It does not actually have to be performed if the resulting defined behavior of programs is the same. But, since passing an improper value may not be defined, we do not have defined behavior.
That may be a strict reading of the rules, but it would not surprise me greatly if printf printed “4294967295” instead of “255” when a were 1.
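One way to sidestep the question entirely is to convert the value yourself before passing it, so that the argument really is a promoted unsigned char. A minimal sketch, assuming 8-bit chars; format_negated is a hypothetical helper, and snprintf stands in for printf so the result can be checked.

```c
#include <stdio.h>

/* Hypothetical helper: negates a and prints the result with %hhu. */
static int format_negated(char *buf, size_t size, unsigned char a)
{
    /* -a has type int. The cast reduces it modulo UCHAR_MAX + 1, so the
       argument passed below is an unsigned char promoted to int. */
    unsigned char negated = (unsigned char)-a;

    return snprintf(buf, size, "%hhu", negated);
}
```

With a equal to 1 and 8-bit chars, the buffer should hold 255, and no strict reading of the standard is needed to justify it.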
printf is a variadic function. The types of the arguments passed through the ... parameter are not known inside the function. As such, any variadic function must rely on other mechanisms to interpret the type of the va_arg arguments. printf and family use a const char* format string to "tell them" what kind of arguments were passed. Passing a type different than the expected type as specified by its format specifier results in Undefined Behavior.
For instance:
printf("%f", 24);
is undefined behavior. There is no conversion from int to a floating-point type anywhere, because the arguments are passed as they are (after promotion), and inside printf the function incorrectly treats that argument as a double. printf does not know and can't know that the real type of the argument is int.
Variadic arguments undergo some promotions of their own. Of interest for your question, unsigned char is promoted to int (or to unsigned int on the rare platform where int cannot represent every unsigned char value). As such, there is no way for a variadic argument to actually be of type unsigned char. So while hhu is indeed the specifier for unsigned char, it actually expects the promoted type, int, which is what you pass to it.
So afaik the code is safe because of the two integer promotions caused by unary minus and passing variadic arguments. I am not 100% sure though. Integer promotions are weird and complicated.
#include<stdio.h>
int main() {
long a = 9;
printf("a = %d",a);//output is 9 but with a warning 'expecting long int'
}
Why can't long here be converted to int?
Variadic functions in general and printf family in particular, are odd special cases. They are notorious for their non-existent type safety, so if you pass the wrong type or use the wrong format string, you invoke undefined behavior and anything can happen.
In your case, most likely int and long happen to have the same representation so the program works despite the warning.
In the case of a regular function though, there is a kind of "demotion" taking place if you pass a larger integer type to a function expecting a smaller one. When this happens, you trigger a conversion from the larger type to the smaller, which is well-defined. (The result will however be compiler-specific if you mix types of different signedness.)
Compilers tend to warn against such implicit conversions, so it is better to do the conversion explicitly with a cast.
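For contrast with printf, here is a small sketch of the regular-function case. The names takes_int, pass_long_implicitly, and pass_long_explicitly are invented for the example.

```c
/* Hypothetical function with a prototype: the parameter type is known. */
static int takes_int(int x)
{
    return x;
}

static int pass_long_implicitly(long value)
{
    /* The prototype makes the compiler convert long to int here.
       Some compilers warn about this when conversion warnings are enabled. */
    return takes_int(value);
}

static int pass_long_explicitly(long value)
{
    /* Same conversion, but spelled out with a cast. */
    return takes_int((int)value);
}
```

Both helpers return 9 when given 9L. Nothing comparable happens for the arguments after the format string in printf, because the ellipsis gives the compiler no parameter type to convert to.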
Because that's the way variadic functions behave in C language. printf is just a function from the standard library and has no special processing. It is declared as
int printf(const char * restrict fmt, ...);
And the standard (n1256 draft for C99) says (emphasis mine):
6.5.2.2 Function calls...
6 If the expression that denotes the called function has a type that does not include a
prototype, the integer promotions are performed on each argument, and arguments that
have type float are promoted to double. These are called the default argument
promotions...
7 ... The ellipsis notation in a function prototype declarator causes
argument type conversion to stop after the last declared parameter. The default argument
promotions are performed on trailing arguments.
That means that for all trailing arguments to printf, float arguments are converted to double and integer promotions occur on integer arguments.
And in 6.3.1.1 Arithmetic operands / Boolean, characters, and integers
2 The following may be used in an expression wherever an int or unsigned int may
be used:
— An object or expression with an integer type whose integer conversion rank is less
than or equal to the rank of int and unsigned int.
— A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type, the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the integer
promotions.48) All other types are unchanged by the integer promotions.
So, since long has a rank greater than int, it is left unchanged by an integer promotion, and the format must be adapted to accept a long:
long a = 9;
printf("a = %ld",a);
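If you want to see which types the integer promotions actually produce, C11's _Generic can report the type of an expression. This is a sketch and not part of the C99 text quoted above: TYPE_NAME and the three helper functions are hypothetical, and unary + is used only because it applies the integer promotions to its operand.

```c
/* Hypothetical macro: maps the type of an expression to a string (C11). */
#define TYPE_NAME(x) _Generic((x), \
    int: "int",                    \
    unsigned int: "unsigned int",  \
    long: "long",                  \
    default: "other")

static const char *promoted_short(void)
{
    short s = 1;
    return TYPE_NAME(+s);   /* short is promoted to int */
}

static const char *promoted_uchar(void)
{
    unsigned char c = 1;
    /* Assumes int can represent every unsigned char value,
       which holds on typical platforms. */
    return TYPE_NAME(+c);
}

static const char *promoted_long(void)
{
    long l = 1;
    return TYPE_NAME(+l);   /* long is left unchanged */
}
```

On a typical platform the three helpers report int, int, and long, which is why a long argument needs %ld while a short or char argument can be printed with %d.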
The following passes a long, yet printf() expects an int due to "%d". Result: undefined behavior (UB). If int and long are the same size, that UB might look like everything is OK, or it may fail. It is UB.
long a = 9;
printf("a = %d",a); // UB
Why can't long here be converted to int?
long can be converted, but the code did not ask for that. The code below does:
printf("a = %d", (int) a); // OK
Can integer promotion happen in the reverse order ... in variadic functions like printf()?
Integer promotions do not happen in the reverse order unless code explicitly down-casts or assigns to a narrower type.
There are cases where demotion will appear to be true with printf().
The below promotes sc to int as it is passed to printf(). printf() will take that int and due to "%hhd" will convert it to signed char and then print that numeric value.
signed char sc = 1;
printf("a = %hhd", sc); // prints 1
The below passes i to printf() as an int. printf() will take that int and due to "%hhd" will convert it to signed char and then print that numeric value. So in this case it looks like i was demoted.
int i = 0x101;
printf("a = %hhd", i); // prints 1
The printf family of functions provide a series of length modifiers, two of them being hh (denoting a signed char or unsigned char argument promoted to int) and h (denoting a signed short or unsigned short argument promoted to int). Historically, these length modifiers have only been introduced to create symmetry with the length modifiers of scanf and are rarely used for printf.
Here is an excerpt of ISO 9899:2011 §7.21.6.1 “The fprintf function” ¶7:
7 The length modifiers and their meanings are:
hh Specifies that a following d, i, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing); or that a following n conversion specifier applies to a pointer to a signed char argument.
h Specifies that a following d, i, o, u, x, or X conversion specifier applies to a short int or unsigned short int argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to short int or unsigned short int before printing); or that a following n conversion specifier applies to a pointer to a short int argument.
...
Ignoring the case of the n conversion specifier, what do these almost identical paragraphs say about the behaviour of h and hh?
In this answer, it is claimed that passing an argument that is outside the range of a signed char, signed short, unsigned char, or unsigned short resp. for a conversion specification with an h or hh length modifier resp. is undefined behaviour, as the argument wasn't converted from type char, short, etc. resp. before.
I claim that the function operates in a well-defined manner for every value of type int and that printf behaves as if the parameter was converted to char, short, etc. resp. before conversion.
One could also claim that invoking the function with an argument that was not of the corresponding type before default argument promotion is undefined behaviour, but this seems abstruse.
Which of these three interpretations of §7.21.6.1¶7 (if at all) is correct?
The standard specifies:
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
[C2011 7.21.6.1/9]
What is meant by "the correct type", is conceivably open to interpretation, but the most plausible interpretation to me is the type that the conversion specification "applies to" as specified earlier in the same section, and as quoted, in part, in the question. I take the parenthetical comments about argument promotion to be acknowledging the ordinary argument-passing rules, and avoiding any implication of these functions being special cases. I do not take the parenthetic comments as relevant to determining the "correct type" of the argument.
What actually happens if you pass an argument of wider type than is correct for the conversion specification is a different question. I am inclined to believe that the C system is unlikely to be implemented by anybody such that it makes a difference whether a printf() argument is actually a char, or whether it is an int whose value is in the range of char. I assert, however, that it is valid behavior for the compiler to check argument type correspondence with the format, and to reject the program if there is a mismatch (because the required behavior in such a case is explicitly undefined).
On the other hand, I could certainly imagine printf() implementations that actually misbehave (print garbage, corrupt memory, eat your lunch) if the value of an argument is outside the range implied by the corresponding conversion specifier. This also is permissible on account of the behavior being undefined.
The book says the following on page 45:
Since an argument of a function call is an expression, type conversions also take place when arguments are passed to functions. In the absence of a function prototype, char and short become int, and float becomes double. This is why we have declared function arguments to be int and double even when the function is called with char and float.
I don't understand what the last sentence there is saying. Can someone lead me in the right direction?
We can see that happen here. According to cplusplus.com, this is the declaration of printf():
int printf(const char * format, ...);
The ... means this function can take an unknown number of parameters of unspecified types, and because it is unspecified, the standardization of numeric types to int and double happens to all printf() parameters except the first, that was specified.
Example:
char x = 10;
short y = 100;
int z = 1000;
printf("Values of char is %d, short is %d, and int is %d", x, y, z);
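All those integer types are automatically converted to int when passed to printf(). We can see that because %d works for all of them.
Note that types with a rank greater than int or double, such as long int, long long, and long double, are not converted. They are passed unchanged and need their own length modifiers (%ld, %lld, %Lf). Their sizes are implementation-defined; long long is at least 64 bits.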
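As a rough illustration of where %d stops being enough, the sketch below mixes promoted types with wider ones. format_mixed is a hypothetical helper, and snprintf is used in place of printf so the output can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: prints narrow and wide integer types side by side. */
static int format_mixed(char *buf, size_t size)
{
    char x = 10;                     /* promoted to int, %d is fine */
    short y = 100;                   /* promoted to int, %d is fine */
    long l = 100000L;                /* not promoted, needs %ld */
    long long ll = 10000000000LL;    /* not promoted, needs %lld */

    return snprintf(buf, size, "%d %d %ld %lld", x, y, l, ll);
}
```

The resulting string is 10 100 100000 10000000000. Swapping %ld or %lld for %d would be a type mismatch, even on platforms where it happens to print the expected digits.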
When you use a prototype for a function in C (ANSI C, as the original K&R specification didn't define parameters this way), you declare each formal parameter as having a type. When you match it with an actual expression, two things can happen:
The formal parameter and the actual expression are the same type. In this case, everything is fine and the expression value is used to initialize the parameter prior to calling the function.
The formal parameter and the actual expression are not the same type. In that case, the compiler tries to do an automatic type conversion, if possible, from the type of the actual expression to the formal parameter type.
In case no prototype is found, the rules you quoted above apply, so chars and shorts get promoted to int values, and float values get promoted to double.
The last sentence in your quoted paragraph tells you that in some example (not shown), the formal parameters are declared as int and double because those are the types the arguments actually arrive as after promotion.
int main()
{
int x,y;
int z;
char s='a';
x=10;y=4;
z = x/y;
printf("%d\n",s); //97
printf("%f",z); //some odd sequence
return 0;
}
In the above piece of code, the char s is automatically converted to int while printing due to the int type in control string, but in the second case the int to float conversion doesn't happen. Why so?
In both cases the second argument is promoted to int. This is how variadic functions work, and has nothing to do with the format string.
The format string is not even looked at by the compiler: it's just an argument to some function. Well, a really helpful compiler might know about printf() and might look at the format string, but only to warn you about mistakes you might have made. In fact, gcc does just that:
t.c:9: warning: format ‘%f’ expects type ‘double’, but argument 2 has type ‘int’
It is ultimately your responsibility to ensure that the variadic arguments match the format string. Since in the second printf() call they don't, the behaviour of the code is undefined.
Functions with variable number of arguments follow the rule of the default argument promotion. Integer promotion rules are applied on arguments of integer types and float arguments are converted to double.
printf("%d\n",s);
s is a char and is converted to int.
printf("%f",z);
z is already an int so no conversion is performed on z
Now the conversion specifier f expects a double but the type of the object after the default argument promotion is an int so it is undefined behavior.
Here is what C says on arguments of library functions with variable number of arguments
(C99, 7.1.4p1) "If an argument to a function has [...] a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined."
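The fix is to perform the conversion yourself so that the argument has the type %f expects. A minimal sketch; format_int_as_double is a hypothetical helper, and snprintf replaces printf only so the result can be checked.

```c
#include <stdio.h>

/* Hypothetical helper: prints an int value using the %f conversion. */
static int format_int_as_double(char *buf, size_t size, int z)
{
    /* The cast makes the argument a double, which is what %f expects. */
    return snprintf(buf, size, "%f", (double)z);
}
```

With z = 10 / 4 (integer division, so z is 2), the buffer holds 2.000000. If the fractional part is wanted, the division itself has to be done in floating point, for example (double)x / y.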
The char is not being promoted to int due to the control string. The char is working as an int because all data that is less than 4 bytes when passed to printf is bumped up to 4 bytes, which is the size of an int, because of the cdecl calling convention of variadic functions (the point of this is so that the data that comes next will be aligned on a 4-byte boundary on the stack).
printf is not type-safe and has no idea what data you really pass it; it blindly reads the control string and extracts a certain number of bytes from the stack based on what sequences it finds, and interprets that set of bytes as the datatype corresponding to the control sequence. It doesn't perform any conversions, and the reason you are getting some weird printout is that the bits of an int are being interpreted as (part of) the bits of a floating-point value, which for %f is a double.
due to the int type in control string
That is incorrect. It is being converted because integer types narrower than int are promoted to int when passed as variadic arguments. Integer values are not converted to floating-point types because the variadic argument mechanism doesn't know what formats are expected.