In section 7.19.6.1 paragraph 8 of the C99 standard:
c If no l length modifier is present, the int argument is converted to an
unsigned char, and the resulting character is written.
In section 7.19.6.1 paragraph 9 of the C99 standard:
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
Does the fprintf function require an int argument?
For example, would passing an unsigned int result in undefined behavior:
unsigned int foo = 42;
fprintf(fp, "%c\n", foo); /* undefined behavior? */
This worries me since an implementation could have defined char as having the same behavior as unsigned char (section 6.2.5 paragraph 15).
In those cases, the integer promotions may require char to be promoted to unsigned int on some implementations, which would leave the following code at risk of undefined behavior there:
char bar = 'B';
fprintf(fp, "%c\n", bar); /* possible undefined behavior? */
Are int variables and literal int constants the only safe way to pass a value to fprintf with the %c specifier?
The %c conversion specification for fprintf requires an int argument. The value has to be of type int after the default argument promotions.
unsigned int foo = 42;
fprintf(fp, "%c\n", foo);
undefined behavior: foo has to be an int.
char bar = 'B';
fprintf(fp, "%c\n", bar);
Not undefined behavior: bar is promoted to int by the default argument promotions, since fprintf is a variadic function.
EDIT: to be fair, there are still some very rare implementations where it can be undefined behavior. For example, if char is an unsigned type with not all char values representable in an int (like in this implementation), the default argument promotion is done to unsigned int.
Yes, printf with "%c" requires an int argument -- more or less.
If the argument is of a type narrower than int, then it will be promoted. In most cases, the promotion is to int, with well defined behavior. In the very rare case that plain char is unsigned and sizeof (int) == 1 (which implies CHAR_BIT >= 16), a char argument is promoted to unsigned int, which can cause undefined behavior.
A character constant is already of type int, so printf("%c", 'x') is well defined even on exotic systems. (Off-topic: In C++, character constants are of type char.)
This:
unsigned int foo = 42;
fprintf(fp, "%c\n", foo);
strictly speaking has undefined behavior. N1570 7.1.4p1 says:
If an argument to a function has ... a type (after promotion) not
expected by a function with variable number of arguments, the behavior
is undefined.
and the fprintf call clearly runs afoul of that. (Thanks to ouah for pointing that out.)
On the other hand, 6.2.5p6 says:
For each of the signed integer types, there is a corresponding (but
different) unsigned integer type (designated with the keyword
unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
and 6.2.5p9 says:
The range of nonnegative values of a signed integer type is a subrange
of the corresponding unsigned integer type, and the representation of
the same value in each type is the same.
with a footnote:
The same representation and alignment requirements are meant to imply
interchangeability as arguments to functions, return values from
functions, and members of unions.
The footnote says that function arguments of types int and unsigned int are interchangeable, as long as the value is within the representable range of both types. (For a typical 32-bit system, that means the value has to be in the range 0 to 2^31-1; int values from -2^31 to -1, and unsigned int values from 2^31 to 2^32-1, are outside the range of the other type, and are not interchangeable.)
But footnotes in the C standard are non-normative. They are generally intended to clarify requirements stated in the normative text, not to impose new requirements. But the normative text here merely states that corresponding signed and unsigned types have the same representation, which doesn't necessarily imply that they're passed the same way as function arguments. In principle, a compiler could ignore that footnote and, for example, pass int and unsigned int arguments in different registers, making fprintf(fp, "%c\n", foo); undefined.
But in practice, there's no reason for an implementation to play that kind of game, and you can rely on fprintf(fp, "%c\n", foo); to work as expected. I've never seen or heard of an implementation where it wouldn't work.
Personally, I prefer not to rely on that. If I were writing that code, I'd add an explicit conversion, via a cast, just so these questions don't arise in the first place:
unsigned int foo = 42;
fprintf(fp, "%c\n", (int)foo);
Or I'd make foo an int in the first place.
In this case,
#include <stdio.h>
int main()
{
unsigned char a = 1;
printf("%hhu", -a);
return 0;
}
The argument -a in printf is promoted to int by the integer promotions applied by the unary minus operator, is then subject to the default argument promotions, and is finally converted to unsigned char by the format specifier.
So -a => -(int)a (by unary -) => no conversion by the function call => (unsigned char)-(int)a (by %hhu). Is my reasoning right?
You are correct that a is promoted to int in -a, and that printf("%hhu", -a); passes an int to printf. The notional conversion performed with %hhu is not clear.
Note that if a is not zero, then -a produces a value (in an int) that is not an unsigned char value. Further, with two’s complement eight-bit signed char, if a is greater than 128, then -a produces a value that is not a signed char value.
To understand %hhu, we look at the specification for u in C 2018 7.21.6.1 8:
The unsigned int argument is converted to unsigned octal (o), unsigned decimal (u),…
and for hh in 7.21.6.1 7:
Specifies that a following d, i, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing);…
First we have to resolve this issue of “signed char or unsigned char”. Does this say we can pass either a signed char or an unsigned char for %hhu? I think not; I think the authors have just put together the language for %hhd (intended to convert a signed char) and %hhu (intended to convert an unsigned char). So I believe the intent is that a promoted unsigned char should be passed for the %hhu conversion specification.
Apple Clang 11.0.0 seems to agree, when passing -a (but not a), it warns: “warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat]”
As noted above, passing -a may pass a value that cannot result from passing a promoted unsigned char. It may even pass a value that cannot result from passing a promoted signed char or unsigned char. In this case, it can be argued we have violated the requirement to pass an unsigned char, and therefore the C standard does not specify the resulting behavior. Even though it says the passed value shall be converted to an unsigned char, I believe that is a notional conversion, not a specific requirement on the library implementation, and that it also falls under the “as if” rules: It does not actually have to be performed if the resulting defined behavior of programs is the same. But, since passing an improper value may not be defined, we do not have defined behavior.
That may be a strict reading of the rules, but it would not surprise me greatly if printf printed “4294967295” instead of “255” when a is 1.
printf is a variadic function. The types of the arguments passed through the ... parameter are not known inside the function. As such, any variadic function must rely on other mechanisms to interpret the types of the va_arg arguments. printf and family use a const char* format string to "tell them" what kind of arguments were passed. Passing a type different than the type expected by its format specifier results in Undefined Behavior.
For instance:
printf("%f", 24)
Is undefined behavior. There is no conversion from int to double anywhere, because the arguments are passed as they are (after promotion), and printf incorrectly treats its first variadic argument as a double. printf does not know, and can't know, that the real type of the argument is int.
Variadic arguments undergo some promotions of their own. Of interest for your question: unsigned char is promoted to int or unsigned int (I am not sure, to be honest). As such, there is no way for a variadic argument to actually be of type unsigned char. So while hhu is indeed the specifier for unsigned char, it will actually expect an unsigned int (or int), which is what you pass to it.
So afaik the code is safe because of the two integer promotions caused by unary minus and passing variadic arguments. I am not 100% sure though. Integer promotions are weird and complicated.
Do I understand the standard correctly that this program causes UB:
#include <stdio.h>
int main(void)
{
char a = 'A';
printf("%c\n", a);
return 0;
}
When it is executed on a system where sizeof(int)==1 && CHAR_MIN==0?
Because if a is unsigned and has the same size (1) as an int, it will be promoted to an unsigned int [1] (2), and not to an int, since an int cannot represent all values of a char. The format specifier "%c" expects an int [2], and using the wrong signedness in printf() causes UB [3].
Relevant quotes from ISO/IEC 9899 for C99
[1] Promotion to int according to C99 6.3.1.1:2:
If an int can represent all values of the original type, the value is
converted to an int; otherwise, it is converted to an unsigned int.
These are called the integer promotions. All other types are
unchanged by the integer promotions.
[2] The format specifier "%c" expects an int argument, C99 7.19.6.1:8 c:
If no l length modifier is present, the int argument is converted to
an unsigned char, and the resulting character is written.
[3] Using the wrong type in fprintf() (3), including wrong signedness, causes UB according to C99 7.19.6.1:9:
... If any argument is not the correct type for the corresponding
conversion specification, the behavior is undefined.
The exception for same type with different signedness is given for the va_arg macro but not for printf() and there is no requirement that printf() uses va_arg (4).
Footnotes (marked with (n)):
(1) This implies INT_MAX==SCHAR_MAX, because char has no padding.
(2) See also this question: Is unsigned char always promoted to int?
(3) The same rules are applied to printf(), see C99 7.19.6.3:2
(4) See also this question: Does printf("%x",1) invoke undefined behavior?
A program can have undefined behavior or not depending on the characteristics of the implementation.
For example, a program that executes
int x = 32767;
x++;
(and is otherwise well defined) has well defined behavior on an implementation with INT_MAX > 32767, and undefined behavior otherwise.
Your program:
#include <stdio.h>
int main(void)
{
char a='A';
printf("%c\n",a);
return 0;
}
has well defined behavior for any hosted implementation with INT_MAX >= CHAR_MAX. On any such implementation, the value of 'A' is promoted to int, which is what %c expects.
If INT_MAX < CHAR_MAX (which implies that plain char is unsigned and that CHAR_BIT >= 16), the value of a is promoted to unsigned int. N1570 7.21.6.1p9:
If any argument is not the correct type for the corresponding
conversion specification, the behavior is undefined.
implies that this has undefined behavior.
In practice, (a) such implementations are rare, likely nonexistent (the only existing C implementations I've heard of with CHAR_BIT > 8 are for DSPs and such implementations are likely to be freestanding), and (b) any such implementation would probably be designed to handle such cases gracefully.
TL;DR there is no UB (in my interpretation at any rate).
6.2.5 types
6. For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
9. The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same 41)
41) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
Furthermore
7.16.1.1 The va_arg macro
2 The va_arg macro expands to an expression that has the specified type and the value of the next argument in the call. [...] If there is no actual next argument, or if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:
one type is a signed integer type, the other type is the corresponding unsigned integer type, and the value is representable in both types;
7.21.6.8 The vfprintf function
288) [...] functions vfprintf, vfscanf, vprintf, vscanf, vsnprintf, vsprintf, and vsscanf invoke the va_arg macro [...]
Thus, it stands to reason that an unsigned type is not "an incorrect type for the corresponding (signed) conversion specification", as long as the value is within the range.
This is corroborated by the fact that major compilers do not warn about signed/unsigned format specification mismatch, even though they do warn about other mismatches, even when the corresponding types have the same representation on a given platform (e.g. long and long long).
Do I understand the standard correctly that this program causes UB:
#include <stdio.h>
int main(void)
{
char a='A';
printf("%c\n",a);
return 0;
}
When it is executed on a system where sizeof(int)==1 && CHAR_MIN==0?
That would be a plausible interpretation of the standard. However, in the event that an implementation with such a combination of type characteristics were produced for genuine use, I have full confidence that it would provide appropriate support for the %c directive -- as an extension, if one wants to interpret it that way. The example program would then have well-defined behavior with respect to that implementation, whether or not the C standard is interpreted to define that behavior, too. I suppose I account that quality-of-implementation issue as being rolled up in "for genuine use".
#include<stdio.h>
int main() {
long a = 9;
printf("a = %d",a);//output is 9 but with a warning 'expecting long int'
}
Why can't long here be converted to int?
Variadic functions in general and printf family in particular, are odd special cases. They are notorious for their non-existent type safety, so if you pass the wrong type or use the wrong format string, you invoke undefined behavior and anything can happen.
In your case, most likely int and long happen to have the same representation so the program works despite the warning.
In the case of a regular function though, there is a kind of "demotion" taking place if you pass a larger integer type to a function expecting a smaller one. When this happens, you trigger a conversion from the larger type to the smaller, which is well-defined. (The result will however be compiler-specific if you mix types of different signedness.)
Compilers tend to warn against such implicit conversions, so it is better to do the conversion explicitly with a cast.
Because that's the way variadic functions behave in C language. printf is just a function from the standard library and has no special processing. It is declared as
int printf(const char * restrict format, ...);
And the standard (n1256 draft for C99) says (emphasis mine):
6.5.2.2 Function calls...
6 If the expression that denotes the called function has a type that does not include a
prototype, the integer promotions are performed on each argument, and arguments that
have type float are promoted to double. These are called the default argument
promotions...
7 ... The ellipsis notation in a function prototype declarator causes
argument type conversion to stop after the last declared parameter. The default argument
promotions are performed on trailing arguments.
That means that on all parameters to printf, float are converted to double and integer promotions occur on integral arguments.
And in 6.3.1.1 Arithmetic operands / Boolean, characters, and integers
2 The following may be used in an expression wherever an int or unsigned int may
be used:
— An object or expression with an integer type whose integer conversion rank is less
than or equal to the rank of int and unsigned int.
— A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type, the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the integer
promotions.48) All other types are unchanged by the integer promotions.
Since long has a rank greater than that of int, it is left unchanged by the integer promotions, and the format must be adapted to accept a long:
long a = 9;
printf("a = %ld",a);
The following passes a long, yet printf() expects an int due to "%d". Result: undefined behavior (UB). If int and long are the same size, that UB might look like everything is OK, or it may fail. It is UB.
long a = 9;
printf("a = %d",a); // UB
Why can't long here be converted to int?
long can be converted, yet the code did not ask for that conversion, unlike the code below.
printf("a = %d", (int) a); // OK
Can integer promotion happen in the reverse order ... in variadic functions like printf()?
Integer promotions do not happen in the reverse order unless the code explicitly casts down or assigns to a narrower type.
There are cases where demotion will appear to happen with printf().
The below promotes sc to int as it is passed to printf(). printf() will take that int and due to "%hhd" will convert it to signed char and then print that numeric value.
signed char sc = 1;
printf("a = %hhd", sc); // prints 1
The below passes i to printf() as an int. printf() will take that int and due to "%hhd" will convert it to signed char and then print that numeric value. So in this case it looks like i was demoted.
int i = 0x101;
printf("a = %hhd", i); // prints 1
The printf family of functions provide a series of length modifiers, two of them being hh (denoting a signed char or unsigned char argument promoted to int) and h (denoting a signed short or unsigned short argument promoted to int). Historically, these length modifiers have only been introduced to create symmetry with the length modifiers of scanf and are rarely used for printf.
Here is an excerpt of ISO 9899:2011 §7.21.6.1 “The fprintf function” ¶7:
7 The length modifiers and their meanings are:
hh Specifies that a following d, i, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing); or that a following n conversion specifier applies to a pointer to a signed char argument.
h Specifies that a following d, i, o, u, x, or X conversion specifier applies to a short int or unsigned short int argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to short int or unsigned short int before printing); or that a following n conversion specifier applies to a pointer to a short int argument.
...
Ignoring the case of the n conversion specifier, what do these almost identical paragraphs say about the behaviour of h and hh?
In this answer, it is claimed that passing an argument that is outside the range of a signed char, signed short, unsigned char, or unsigned short resp. for a conversion specification with an h or hh length modifier resp. is undefined behaviour, as the argument wasn't converted from type char, short, etc. resp. before.
I claim that the function operates in a well-defined manner for every value of type int and that printf behaves as if the parameter was converted to char, short, etc. resp. before conversion.
One could also claim that invoking the function with an argument that was not of the corresponding type before default argument promotion is undefined behaviour, but this seems abstruse.
Which of these three interpretations of §7.21.6.1¶7 (if at all) is correct?
The standard specifies:
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
[C2011 7.21.6.1/9]
What is meant by "the correct type", is conceivably open to interpretation, but the most plausible interpretation to me is the type that the conversion specification "applies to" as specified earlier in the same section, and as quoted, in part, in the question. I take the parenthetical comments about argument promotion to be acknowledging the ordinary argument-passing rules, and avoiding any implication of these functions being special cases. I do not take the parenthetic comments as relevant to determining the "correct type" of the argument.
What actually happens if you pass an argument of wider type than is correct for the conversion specification is a different question. I am inclined to believe that the C system is unlikely to be implemented by anybody such that it makes a difference whether a printf() argument is actually a char, or whether it is an int whose value is in the range of char. I assert, however, that it is valid behavior for the compiler to check argument type correspondence with the format, and to reject the program if there is a mismatch (because the required behavior in such a case is explicitly undefined).
On the other hand, I could certainly imagine printf() implementations that actually misbehave (print garbage, corrupt memory, eat your lunch) if the value of an argument is outside the range implied by the corresponding conversion specifier. This also is permissible on account of the behavior being undefined.
Say I want to print unsigned char:
unsigned char x = 12;
Which is correct? This:
printf("%d",x);
or this:
printf("%u",x);
?
The thing is elsewhere on SO I encountered such discussion:
-Even with ch changed to unsigned char, the behavior of the code is not defined by the C standard. This is because the unsigned char is promoted to an int (in normal C implementations), so an int is passed to printf for the specifier %u. However, %u expects an unsigned int, so the types do not match, and the C standard does not define the behavior
-Your comment is incorrect. The C11 standard states that the conversion specifier must be of the same type as the function argument itself, not the promoted type. This point is also specifically addressed in the description of the hh length modifier: "the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing"
So which is correct? Is there a reliable source on this matter? (By that logic, should we also print unsigned short int with %d, since it can be promoted to int?)
The correct one is*:
printf("%d",x);
This is because of the default argument promotions, as printf() is a variadic function. This means that an unsigned char value is always promoted to int.
From N1570 (C11 draft) 6.5.2.2/6 Function calls (emphasis mine going forward):
If the expression that denotes the called function has a type that
does not include a prototype, the integer promotions are performed on
each argument, and arguments that have type float are promoted to
double. These are called the default argument promotions.
and subclause 6.5.2.2/7 says:
The ellipsis notation in a function prototype declarator causes
argument type conversion to stop after the last declared parameter.
The default argument promotions are performed on trailing arguments.
These integer promotions are defined in 6.3.1.1/2 Boolean, characters, and integers:
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions.58) All other types are unchanged by the integer
promotions.
This quote answers your second question of unsigned short (see comment below).
* With the exception of an unsigned char wider than 8 bits (e.g. it might occupy 16 bits); see @chux's answer.
Correct format specifier for unsigned char x = 12 depends on a number of things:
If INT_MAX >= UCHAR_MAX, which is often the case, use "%d". In this case an unsigned char is promoted to int.
printf("%d",x);
Otherwise use "%u" (or "%x", "%o"). In this case an unsigned char is promoted to unsigned.
printf("%u",x);
Up-to-date compilers support the "hh" length modifier, which compensates for this ambiguity. Should x get promoted to int or unsigned due to the standard promotions of variadic parameters, printf() converts it to unsigned char before printing.
printf("%hhu",x);
If dealing with an old compiler without "hh" or seeking highly portable code, use explicit casting
printf("%u", (unsigned) x);
The same issue/answer applies to unsigned short, except compare INT_MAX >= USHRT_MAX and use "h" instead of "hh".
For cross platform development, I typically bypass the promoting issue by using inttypes.h
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/inttypes.h.html
This header (which is in the C99 standard) defines all the printf types for the basic types. So if you want a uint8_t (a syntax which I highly suggest using instead of unsigned char) I would use
#include <inttypes.h>
#include <stdint.h>
uint8_t x = 12;
printf("%" PRIu8 "\n",x);
Both unsigned char and unsigned short can always safely be printed with %u. Default argument promotions convert them either to int or to unsigned int. If they are promoted to the latter, everything is fine (the format specifier and the type passed match); otherwise C11 (n1570) 6.5.2.2 p6, first bullet, applies:
one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;
The standard is quite clear that default argument promotions apply to the variadic arguments of printf, e.g. it's mentioned again for the (mostly useless) h and hh length modifiers (ibid. 7.21.6.1 p7, emph. mine):
hh -- Specifies that a following d, i, o, u, x, or X conversion specifier applies to a signed char or unsigned char argument (the argument will have been promoted according to the integer promotions, but its value shall be converted to signed char or unsigned char before printing); [...]