How exactly do variadic functions treat numeric constants? e.g. consider the following code:
myfunc(5, 0, 1, 2, 3, 4);
The function looks like this:
void myfunc(int count, ...)
{
}
Now, in order to iterate over the single arguments with va_arg, I need to know their sizes, e.g. int, short, char, float, etc. But what size should I assume for numeric constants like I use in the code above?
Tests have shown that just assuming int for them seems to work fine so the compiler seems to push them as int even though these constants could also be represented in a single char or short each.
Nevertheless, I'm looking for an explanation for the behaviour I see. What is the standard type in C for passing numeric constants to variadic functions? Is this clearly defined or is it compiler-dependent? Is there a difference between 32-bit and 64-bit architecture?
Thanks!
I like Jonathan Leffler's answer, but I thought I'd pipe up with some technical details for those who intend to write a portable library or similar that provides an API with variadic functions, and thus need to delve into the details.
Variadic parameters are subject to default argument promotions (C11 draft N1570 as PDF; section 6.5.2.2 Function calls, paragraph 6):
.. the integer promotions are performed on each argument, and arguments that
have type float are promoted to double. These are called the default argument promotions.
[If] .. the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:
one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;
both types are pointers to qualified or unqualified versions of a character type or void
Floating-point constants are of type double, unless they are suffixed with f or F (as in 1.0f), in which case they are of type float.
In C99 and C11, decimal integer constants without a suffix are of type int if they fit in one; long (AKA long int) if they fit in one otherwise; or long long (AKA long long int) otherwise. Since many compilers warn about an unsuffixed integer constant that does not fit in an int, treating it as a likely human error or typo, it is good practice to always include the suffix if the integer constant is not of type int.
Integer constants can also have a letter suffix to denote their type:
u or U for unsigned int
l or L for long int
lu or ul or LU or UL or lU or Lu or uL or Ul for unsigned long int
ll or LL or Ll or lL for long long int
llu or LLU (or ULL or any of their uppercase or lowercase variants) for unsigned long long int
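The suffix rules above can be checked at compile time with C11 _Generic. The following is only a sketch: the enum constant_kind, the macro CONSTANT_KIND and the helper kind_name are hypothetical names made up for this illustration, not part of any standard header.

```c
/* Hypothetical labels for the types a constant can have. */
enum constant_kind {
    KIND_INT, KIND_UINT, KIND_LONG, KIND_ULONG,
    KIND_LLONG, KIND_ULLONG, KIND_FLOAT, KIND_DOUBLE, KIND_OTHER
};

/* Selects a label from the type of the expression; no promotion is applied. */
#define CONSTANT_KIND(x) _Generic((x),        \
    int: KIND_INT,                            \
    unsigned int: KIND_UINT,                  \
    long: KIND_LONG,                          \
    unsigned long: KIND_ULONG,                \
    long long: KIND_LLONG,                    \
    unsigned long long: KIND_ULLONG,          \
    float: KIND_FLOAT,                        \
    double: KIND_DOUBLE,                      \
    default: KIND_OTHER)

/* Maps a label to a printable name, e.g. for use with printf("%s"). */
const char *kind_name(enum constant_kind kind)
{
    static const char *const names[] = {
        "int", "unsigned int", "long", "unsigned long",
        "long long", "unsigned long long", "float", "double", "other"
    };
    return names[kind];
}
```

For example, CONSTANT_KIND(5) selects KIND_INT, CONSTANT_KIND(5ul) selects KIND_ULONG, and CONSTANT_KIND(1.0f) selects KIND_FLOAT.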
The integer promotion rules are in section 6.3.1.1.
To summarize the default argument promotion rules for C11 (there are some additions compared to C89 and C99, but no significant changes):
float are promoted to double
All integer types whose values can be represented by an int are promoted to int. (This includes both unsigned and signed char and short, and bit-fields of types _Bool, int, and smaller unsigned int bit-fields.)
All integer types whose values can be represented by an unsigned int (but not by an int) are promoted to unsigned int. (This includes unsigned int bit-fields too wide to be represented by an int, in other words bit-fields of CHAR_BIT * sizeof (unsigned int) bits, and typedef'd aliases of unsigned int, but that's it, I think.)
Integer types at least as large as int are unchanged. This includes types long/long int, long long/long long int, and size_t, for example.
There is one 'gotcha' in the rules that I'd like to point out: "signed to unsigned is okay, unsigned to signed is iffy":
If the argument is promoted to a signed integer type, but the function obtains the value using the corresponding unsigned integer type, the function obtains the correct value using modulo arithmetic.
That is, negative values will be as if they were incremented by (1 + maximum representable value in the unsigned integer type), making them positive.
If the argument is promoted to an unsigned integer type, but the function obtains the value using the corresponding signed integer type, and the value is representable in both, the function obtains the correct value. If the value is not representable in both, the behaviour is implementation-defined.
In practice, almost all architectures do the opposite of the above, i.e. the signed integer value obtained matches the unsigned value minus (1 + the largest representable value of the unsigned integer type). I've heard that some strange ones may signal integer overflow or something similarly weird, but I have never gotten my mitts on such machines.
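The modulo arithmetic mentioned above is the same rule the standard applies to an ordinary signed-to-unsigned conversion. A minimal sketch using explicit casts rather than va_arg (the function names to_unsigned and minus_one_wraps_to_max are hypothetical):

```c
#include <limits.h>

/* Converting a negative int to unsigned int adds UINT_MAX + 1 to the value. */
unsigned int to_unsigned(int value)
{
    return (unsigned int)value;
}

/* Returns 1 when -1 converts to the largest unsigned int value. */
int minus_one_wraps_to_max(void)
{
    return to_unsigned(-1) == UINT_MAX;
}
```

Non-negative values are unchanged by the conversion, so to_unsigned(5) is simply 5.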
The man 3 printf man page (courtesy of the Linux man pages project) is quite informative, if you compare the above rules to printf specifiers. The make_message() example function at the end (C99, C11, or POSIX required for vsnprintf()) should also be interesting.
When you write 1, that is an int constant. There is no other type that the compiler is allowed to use. If there is a non-variadic prototype for the function that demands a different type, the compiler will convert the integer 1 to the appropriate type, but on its own, 1 is an int constant. So, in your example, all 6 arguments are int.
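As a sketch of what this means for the function in the question, here is one way to walk the arguments, assuming every variadic argument really is an int. The name sum_ints is hypothetical; the question's myfunc has an empty body.

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` variadic arguments, all assumed to be int. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* unsuffixed small constants are int */
    va_end(ap);

    return total;
}
```

A call such as sum_ints(5, 0, 1, 2, 3, 4) mirrors the call in the question and reads each constant back as an int.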
You have to know the types of the arguments somehow before the called variadic function processes them. With the printf() family of functions, the format string tells it what to expect; similarly with the scanf() family of functions.
Note that the default conversions apply to the arguments corresponding to the ellipsis of a variadic function. For example, given:
char c = '\007';
short s = 0xB0D;
float f = 3.1415927;
a call to:
int variadic_function(const char *, ...);
using:
int rc = variadic_function("c s f", c, s, f);
actually converts both c and s to int, and f to double.
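A minimal sketch of the receiving side, assuming a made-up convention where the letters c, s and f in the string announce char, short and float arguments. The function name sum_csf and that convention are hypothetical; the point is only that the callee must ask va_arg for the promoted types.

```c
#include <stdarg.h>

/* Hypothetical callee: adds up the arguments announced by the letters in fmt. */
double sum_csf(const char *fmt, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        switch (*fmt) {
        case 'c':                       /* char arrives as int */
        case 's':                       /* short arrives as int */
            total += va_arg(ap, int);
            break;
        case 'f':                       /* float arrives as double */
            total += va_arg(ap, double);
            break;
        default:                        /* spaces and other characters are skipped */
            break;
        }
    }
    va_end(ap);

    return total;
}
```

Asking for va_arg(ap, char), va_arg(ap, short) or va_arg(ap, float) instead would not match the promoted argument types.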
I call recv() to receive data from a socket and print the last byte of the buffer in hex:
char rbuff[B_BUF];
while ((r_n=recv(sfd,rbuff,B_BUF,MSG_EOF))>-1)
{
printf("r_n:%d eob_p:%x\n",r_n,rbuff[r_n-1]);
if (r_n==0)
{
break;
}
memset(rbuff,0,B_BUF);
}
the result is
r_n:1674 eob_p:3c
r_n:1228 eob_p:76
r_n:2456 eob_p:ffffff81
r_n:1228 eob_p:4b
r_n:1228 eob_p:49
r_n:2456 eob_p:57
r_n:1417 eob_p:ffffff82
I am confused about why the result is 4 bytes wide.
I wrote another program to print the file that was saved from the buffer:
int main ()
{
char buff[11686];
memset(buff,0,11686);
FILE *in =fopen("web/www.sse.com.cn.html","r");
fread(buff,11686,1,in);
for (int i = 0; i < 11686 ; i++)
{
printf("%x\n",buff[i]);
}
}
the result is
....
buff[11684]:60
buff[11685]:ffffff82
Why does an element of the char buffer print as 4 bytes, as in buff[11685]:ffffff82?
Diagnosis
In the second example, buff is a char buffer and plain char is a signed type on your machine, and you're storing values which are negative in buff, so when they're converted to int in the call to printf(), they are negative integers (of small magnitude), printed in hex.
ISO/IEC 9899:2018
Actually, the links are to an online draft of C11, not C18, in HTML which allows links to the relevant paragraphs in the standard. AFAIK, these details have not changed between C90, C99, C11 and C18 anyway.
The standard says that the plain char type is equivalent to either signed char or unsigned char.
§6.2.5 Types ¶15:
The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.45)
45) CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
§6.3.1.1 Boolean, characters and integers ¶2,3:
2 The following may be used in an expression wherever an int or unsigned int may be used:
An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.58) All other types are unchanged by the integer promotions.
3 The integer promotions preserve value including sign. As discussed earlier, whether a "plain" char is treated as signed is implementation-defined.
58) The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators, as specified by their respective subclasses.
§6.5.2.2 Function calls ¶6,7:
6 If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:
one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;
both types are pointers to qualified or unqualified versions of a character type or void.
7 If the expression that denotes the called function has a type that does include a prototype, the arguments are implicitly converted, as if by assignment, to the types of the corresponding parameters, taking the type of each parameter to be the unqualified version of its declared type. The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument promotions are performed on trailing arguments.
Exegesis
Note the last two sentences of §6.5.2.2 ¶7: when the char values are promoted by the 'integer promotions', they are promoted to a (signed) int, and the negative values remain negative. Since an int has 4 bytes, and all the machines you're likely to have available use two's-complement arithmetic, the most significant 3 bytes of the value will be 0xFF each.
Prescription
To always print 2-digit hex for the characters, use %.2X (or %.2x if you prefer; you can also use either %02X or %02x) and pass either (unsigned char)rbuff[r_n-1] or rbuff[r_n-1] & 0xFF as the argument (using the variables from the first example). Or, using the variables from the second example:
printf("%.2X\n", (unsigned char)buff[i]);
printf("%.2X\n", buff[i] & 0xFF);
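The prescription can be wrapped in a small helper. This is only a sketch: byte_value and dump_hex are hypothetical names, and the buffer is whatever char array you already have.

```c
#include <stdio.h>

/* Converts a plain char to its value as an unsigned byte (0..UCHAR_MAX). */
unsigned int byte_value(char ch)
{
    return (unsigned char)ch;
}

/* Prints every byte of buf as exactly two hex digits, one per line. */
void dump_hex(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        printf("%.2X\n", byte_value(buf[i]));
}
```

Because the value is converted to unsigned char before it is promoted, there is no sign extension to fill the upper bytes.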
I'm trying to understand implicit datatype conversions in C. I thought that I had understood this topic, but yet the following code example is still confusing me.
Specifically, I have read about Usual Arithmetic Conversions and Integer Promotion previously from drafts of the C Standard.
unsigned short int a = 0;
printf("\n%lld", (signed int)a - 1);
I am compiling using GCC.
unsigned short int is 2 bytes.
int is 4 bytes.
When I run this code, I get the following result: 4294967295
I expected the result -1.
This is what I expected to happen:
Typecast takes precedence, and LHS of - becomes signed int.
- operation is carried out. No integer promotion or implicit conversions occur here, as LHS and RHS are already both signed int. The result of the operation is -1 with datatype signed int.
Within printf statement, value -1 is retained within the conversion to long long int, and -1 is displayed as the result.
Can someone please explain where the flaw in my logic is?
It's undefined behaviour due to %lld being the inappropriate format specifier for an int type.
Yes indeed (signed int)a - 1 is an int type with value -1, but the printf call is the undefined part. There's nothing in the C standard to suggest that a conversion to long long occurs.
Within printf statement, value -1 is retained within the conversion to long long int
There's no such conversion taking place. printf (family of functions) is dumb and needs a format string that corresponds to the types of the argument list.
printf does not work like an ordinary function void f (long long int x), which would have forced an implicit conversion to the type of the parameter ("as per assignment"/"lvalue conversion"). This would have given you the expected "sign extension".
Notably, there's another kind of specialized implicit conversion going on here, called the default argument promotions, which only applies to variable argument functions and functions with no prototype.
C17 6.5.2.2/6
If the expression that denotes the called function has a type that does not include a
prototype, the integer promotions are performed on each argument, and arguments that
have type float are promoted to double. These are called the default argument
promotions.
C17 6.5.2.2/7 regarding variable argument functions:
The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument
promotions are performed on trailing arguments.
In practice this means:
float passed to printf gets implicitly converted to double during function call.
Small integer types passed to printf get implicitly converted during function call as per integer promotions, most likely ending up as int.
Other types passed to printf do not get implicitly promoted during the function call.
And then the passed and potentially converted argument gets treated internally as if it was the type specified by the conversion specifier. If that one doesn't match the actual type, the code has undefined behavior.
In your case you pass an int, it doesn't get implicitly promoted, but as printf treats it as a long long, you get undefined behavior.
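A sketch of the two well-defined ways to print the value: either use %d for the int, or convert explicitly to long long before passing it to %lld. The function names minus_one_ll and print_both are hypothetical.

```c
#include <stdio.h>

/* The explicit conversion that printf will not perform on its own. */
long long minus_one_ll(unsigned short a)
{
    return (long long)((int)a - 1);
}

/* Prints the same value twice, each time with a matching conversion specifier. */
void print_both(unsigned short a)
{
    printf("%d\n", (int)a - 1);          /* int argument, int specifier */
    printf("%lld\n", minus_one_ll(a));   /* long long argument, long long specifier */
}
```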
Here you can consider yourself lucky. a is an unsigned short int that undergoes the usual arithmetic conversions to signed int with or without the cast, so
unsigned short int a = 0;
printf("\n%d", (signed int)a - 1);
and
unsigned short int a = 0;
printf("\n%d", a - 1);
would have the same behaviour, if all values of unsigned short are representable in int (as they are in your case). The result of the conversion is an int. Now, for the variable arguments, the default argument promotions are applied and any integers smaller than an int are converted to int if representable, otherwise unsigned int. But %lld expects a signed long long int, which is 8 bytes wide. Default argument promotions do not promote int implicitly to long long int.
Now comes the luck part - you did get a wrong value. See, since the behaviour is undefined you could have gotten the value that you're expecting, this time - after all it is completely feasible on a 64-bit processor!
7.16.1.1 ¶2 describes va_arg as follows (emphasis mine):
If there is no actual next argument, or if
type is not compatible with the type of the actual next argument (as promoted according
to the default argument promotions), the behavior is undefined, except for the following
cases:
one type is a signed integer type, the other type is the corresponding unsigned integer
type, and the value is representable in both types;
one type is pointer to void and the other is a pointer to a character type.
To my understanding (and 6.5.2.2, Function calls, does not seem to contradict me, though I might be wrong), the default promotions are:
char to either int or unsigned (implementation specified)
signed char to int
unsigned char to unsigned
short to int
unsigned short to unsigned
float to double
This is all fine and dandy when you know the exact underlying types passed to the va_list (except for char, which AFAIK is impossible to retrieve portably because its signedness is implementation specified).
It gets more complicated when you're expecting types from <stdint.h> to be passed to your va_list.
int8_t and int16_t, deducing from the limits, are guaranteed to be promoted to or already be of type int. However, it's very dubious to rely on my own "logical" observations about the limits, so I'm seeking your (and the standard's) confirmation of this deduction (I may be missing some corner cases I'm not even aware of).
the same holds for uint8_t and uint16_t, except the underlying type is unsigned
int32_t may or may not be promoted to int. It may be larger than, smaller than or exactly the same as int. The same holds for uint32_t, but for unsigned. How to portably retrieve int32_t and uint32_t passed to va_list? In other words, how to determine if int32_t (uint32_t) has been promoted to int (unsigned)? In yet other words, how to determine whether I should use va_arg(va, int) or va_arg(va, int32_t) to retrieve int32_t passed to the variadic function without invoking undefined behaviour on any platform?
I believe the same questions are valid for int64_t and uint64_t.
This is a theoretical (standard-only concerned) question, with a presumption that all exact-width types in <stdint.h> are present. I'm not interested in "what's true in practice" type of answers, because I believe I already know them.
EDIT
One idea that I have in mind is to use _Generic to determine the underlying type of int32_t. I'm not sure how exactly would you use it though. I'm looking for better (easier) solutions.
#define IS_INT_OR_PROMOTED(X) _Generic((X)0 + (X)0, int: 1, default: 0)
Usage:
int32_t x = IS_INT_OR_PROMOTED(int32_t) ?
(int32_t)va_arg(list, int) :
va_arg(list, int32_t);
With gcc on my PC the macro returns 1 for int8_t, int16_t and int32_t, and 0 for int64_t.
With gcc-avr (a 16-bit target) the macro returns 1 for int8_t and int16_t, and 0 for int32_t and int64_t.
For long the macro returns 0 regardless of whether sizeof(int)==sizeof(long).
I don't have any targets with 64-bit ints but I don't see why it wouldn't work on such a target.
I was initially not sure this would work with truly pathological implementations, but I'm now pretty sure it will work with any conforming implementation.
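Put together, the macro and its usage could look like the following sketch. The variadic function sum_i32 is a hypothetical example that is not part of the answer above; it assumes C11 and that int32_t exists.

```c
#include <stdarg.h>
#include <stdint.h>

/* 1 if type X is int or promotes to int, 0 otherwise (macro from the answer). */
#define IS_INT_OR_PROMOTED(X) _Generic((X)0 + (X)0, int: 1, default: 0)

/* Hypothetical example: sums `count` variadic arguments passed as int32_t. */
int64_t sum_i32(int count, ...)
{
    va_list list;
    int64_t total = 0;

    va_start(list, count);
    for (int i = 0; i < count; i++) {
        /* Only the branch matching the promoted type is ever evaluated. */
        int32_t x = IS_INT_OR_PROMOTED(int32_t)
                        ? (int32_t)va_arg(list, int)
                        : va_arg(list, int32_t);
        total += x;
    }
    va_end(list);

    return total;
}
```

The condition is a constant expression, so a compiler can usually discard the branch that does not apply.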
Indeed there is no good way to do this. I consider the canonical answer to be "don't do this". Beyond not passing such types as arguments to variadic functions, avoid even using them as "variables" and only use them as "storage" (in arrays and structs that exist in large quantities). Of course it's easy to make a mistake and pass such an element/member as an argument to your variadic function, so that's not very satisfying.
Your idea with _Generic only works if these types aren't defined with implementation-specific extended integer types your code is unaware of.
There's an awful but valid approach involving passing the va_list to vsnprintf with the right "PRI*" macro, then parsing the integer from the string, but after doing this the list is in a state where you can't use it again, so it only works for the final argument.
Your best bet is probably trying to find a formula for "does this type get promoted by default promotions?" You can easily query whether the max value of the type exceeds INT_MAX or UINT_MAX, but this still doesn't help the formal correctness if there's a spurious extended integer type with the same range.
Regarding the #if and <limits.h> solution, I found this (6.2.5 ¶8):
For any two integer types with the same signedness and different integer conversion rank
(see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a
subrange of the values of the other type.
And 6.3.1.1 states (emphasis mine):
Every integer type has an integer conversion rank defined as follows:
No two signed integer types shall have the same rank, even if they have the same
representation.
The rank of a signed integer type shall be greater than the rank of any signed integer
type with less precision.
The rank of long long int shall be greater than the rank of long int, which
shall be greater than the rank of int, which shall be greater than the rank of short
int, which shall be greater than the rank of signed char.
The rank of any unsigned integer type shall equal the rank of the corresponding
signed integer type, if any.
The rank of any standard integer type shall be greater than the rank of any extended
integer type with the same width.
The rank of char shall equal the rank of signed char and unsigned char.
The rank of _Bool shall be less than the rank of all other standard integer types.
The rank of any enumerated type shall equal the rank of the compatible integer type
(see 6.7.2.2).
The rank of any extended signed integer type relative to another extended signed
integer type with the same precision is implementation-defined, but still subject to the
other rules for determining the integer conversion rank.
For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has
greater rank than T3, then T1 has greater rank than T3.
And this is what 6.5.2.2 ¶6 says (emphasis mine):
If the expression that denotes the called function has a type that does not include a
prototype, the integer promotions are performed on each argument, and arguments that
have type float are promoted to double. These are called the default argument
promotions. If the number of arguments does not equal the number of parameters, the
behavior is undefined. If the function is defined with a type that includes a prototype, and
either the prototype ends with an ellipsis (, ...) or the types of the arguments after
promotion are not compatible with the types of the parameters, the behavior is undefined.
If the function is defined with a type that does not include a prototype, and the types of
the arguments after promotion are not compatible with those of the parameters after
promotion, the behavior is undefined, except for the following cases:
one promoted type is a signed integer type, the other promoted type is the
corresponding unsigned integer type, and the value is representable in both types;
both types are pointers to qualified or unqualified versions of a character type or
void
Based on these observations I'm led to believe that
#if INT32_MAX < INT_MAX
int32_t x = va_arg(va, int);
#else
int32_t x = va_arg(va, int32_t);
#endif
This is because if the range of int32_t can't contain the range of int, then the range of int32_t is a subrange of int, which means that the rank of int32_t is lower than that of int, and this means that the integer promotion is performed.
On the other hand, if the range of int32_t can contain the range of int, then the range of int32_t is the range of int or a superset of the range of int, and thus the rank of int32_t is greater or equal than the rank of int, which means that the integer promotion is not performed.
EDIT
Corrected the test, according to the comments.
#if INT32_MAX <= INT_MAX && INT32_MIN >= INT_MIN
int32_t x = va_arg(va, int);
#else
int32_t x = va_arg(va, int32_t);
#endif
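For context, the corrected test could sit inside a variadic function like the sketch below. The function last_i32 is a hypothetical example, and the approach still rests on the assumption that int32_t is not an extended integer type with the same range as int but a different rank.

```c
#include <limits.h>
#include <stdarg.h>
#include <stdint.h>

/* Hypothetical example: returns the last of `count` int32_t arguments, or 0. */
int32_t last_i32(int count, ...)
{
    va_list va;
    int32_t x = 0;

    va_start(va, count);
    for (int i = 0; i < count; i++) {
#if INT32_MAX <= INT_MAX && INT32_MIN >= INT_MIN
        x = (int32_t)va_arg(va, int);   /* int32_t fits in int: read it as int */
#else
        x = va_arg(va, int32_t);        /* int32_t is wider than int: not promoted */
#endif
    }
    va_end(va);

    return x;
}
```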
EDIT 2:
I'm now specifically interested in this case:
int is 32-bit one's complement integer.
int32_t is 32-bit two's complement integer (extended type)
the width (same as precision?) is the same
but because "The rank of any standard integer type shall be greater than the rank of any extended integer type with the same width." the rank of int is higher than that of int32_t
this means the integer promotion from int32_t to int must be performed
even though int can't represent all values in int32_t (specifically, it can't represent INT32_MIN)
What happens? Or am I missing something?
After reading quite a few questions on integer promotions, it seems to be the common understanding that integer promotions are only applied to small integer types, such as short int or char.
However, I'm wondering why an unsigned int variable of, e.g., value 15 shouldn't be promoted to an int as well. After all, its conversion rank is equal to the rank of int and unsigned int, as required by statement (1) in the citation below.
As an int can represent the value 15 without any problems (on all platforms I know of), it should get converted to an int.
Integer promotions
The following may be used in an expression wherever an int or unsigned
int may be used:
An object or expression with an integer type whose integer conversion rank is less than or equal to
the rank of int and unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer
promotions.
However, I'm wondering why an unsigned int variable of, e.g., value 15 shouldn't be promoted to an int as well. [...] an int can represent the value 15 without any problems
There are two problems with this statement:
Integer promotion does not mean "promotion to an int"; it means "promotion to either an int or unsigned int". Therefore, "promoting" an unsigned int does not make sense: it is already promoted.
Integer promotion rules do not take into consideration the current value of an expression. The rules are specifically written in a way to talk about all values of a type. Hence, the fact that an int is capable of representing the value 15 is irrelevant, because int is not capable of representing all values of unsigned int.
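That the result depends on the type and not on the value can be observed with C11 _Generic. In this sketch the macro PROMOTED_KIND and the function show_promotions are hypothetical; the unary + is used only because it applies the integer promotions to its operand.

```c
#include <stdio.h>

/* 1 if x promotes to int, 2 if it promotes to unsigned int, 0 for anything else. */
#define PROMOTED_KIND(x) _Generic(+(x), int: 1, unsigned int: 2, default: 0)

/* Both variables hold 15, yet they promote differently because their types differ. */
void show_promotions(void)
{
    unsigned char small = 15;
    unsigned int same_value = 15;

    /* On platforms where int is wider than char, the first line reports 1. */
    printf("unsigned char -> %d\n", PROMOTED_KIND(small));
    printf("unsigned int  -> %d\n", PROMOTED_KIND(same_value));
}
```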
Usually, it's impossible to say at compile time what values variables will hold when the promotion actually occurs.
Inspecting the value of a variable at runtime to choose an appropriate type would introduce too much overhead, and is simply impossible in a statically typed language anyway. So the only thing the compiler has to go on is the types.
I think the main reason to prefer unsigned types over signed ones is that unsigned integer overflow is defined, while overflow of a signed integer is undefined behavior.
Similar to the question Bitshift and integer promotion?, I have a question about integer promotion when using left bitshifts.
unsigned int test(void)
{
unsigned char value8;
unsigned int result;
value8 = 0x12;
result = value8 << 8;
return result;
}
In this case, will value8 first be promoted to unsigned int, or is it implementation-specific?
6.5.7 Bitwise shift operators ... 3 Semantics ...
The integer promotions are performed on each of the operands. The type of the result is
that of the promoted left operand. If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is undefined.
It says that "The integer promotions are performed on each of the operands.", but what is the promotion rule here?
I assume that it should be converted to int if its rank is lower than that of int, but I can't find it.
I ask this, as one compiler (Renesas nc30wa) doesn't promote to int, so the result is always 0 for my sample.
On this platform, a char is 8 bit wide and int 16 bits.
The phrase "the integer promotions" is a very specific thing, found in (for C99) section 6.3.1.1 Booleans, characters, and integers:
If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
So assuming your unsigned char can be held in an int, it will be promoted to an int. On those rare platforms where unsigned char is as wide as an int, it will promote to an unsigned int.
This is only changed slightly in C11:
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
If a specific compiler doesn't follow this behaviour, then it's not really conforming. However, given that the compiler you listed is for embedded systems, it's not really surprising.
Many are built for specific purposes and conformance is not always high on the list of requirements. There may be compiler flags that will allow it to more closely conform to the standard.
Looking at your particular environment, the M16C Series, R8C Family C Compiler Package V.5.45 C Compiler has, in section 2.1.4 nc30 Command Line Options, subsection f. Generated code modification options:
-fextend_to_int, -fETI: Performs operation after extending char-type data to the int type. Extended according to ANSI standards.
although I suspect -fansi is probably a better choice since it covers a few other things as well.
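Independently of compiler flags, an explicit cast makes the intent visible in the source. The following is a sketch with hypothetical function names; whether a non-conforming compiler honours the cast as expected is an assumption you would need to check against its documentation.

```c
/* Widens the operand explicitly, so the shift is done in unsigned int. */
unsigned int shift_left_8(unsigned char value8)
{
    return (unsigned int)value8 << 8;
}

/* Relies on the integer promotions: value8 becomes int before the shift. */
unsigned int shift_left_8_promoted(unsigned char value8)
{
    return (unsigned int)(value8 << 8);
}
```

On a conforming compiler both functions return 0x1200 for 0x12. The cast version also avoids shifting a signed int, which matters on a 16-bit int target when value8 is 0x80 or larger.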
value8 is promoted to int, assuming the conversion rank of unsigned char is lower than the conversion rank of int (usually the case on most platforms).
The conversion ranks of integers are described in C99 in 6.3.1.1.
Note that some compilers disable the integer promotions rules by default. For example, MicroChip compiler MPLAB C18. Look for ISO conformance in the documentation of your compiler.