The following code gives me a compiler warning
warning C4133: ':' : incompatible types - from 'YTYPE *' to 'XTYPE *'
However, the expression seems OK to me. Any ideas?
struct XTYPE {
int x;
long y;
};
struct YTYPE {
long y;
int x;
};
extern void *getSomething(void);
void Test(void)
{
int b= 0;
struct XTYPE *pX;
struct YTYPE *pY;
void * (*pfFoo)(void);
pfFoo= getSomething;
if (b ? (pX= (*pfFoo)()) // error
: (pY= (*pfFoo)()) )
{
;
}
if (b ? ((pX= (*pfFoo)())!=0) // no error
: ((pY= (*pfFoo)())!=0) )
{
;
}
}
It's a constraint violation, simply put. To begin with, the type of an assignment expression is determined by the left hand side. So your case sees struct XTYPE* and struct YTYPE*.
6.5.16 Assignment operators - p3
An assignment operator stores a value in the object designated by the
left operand. An assignment expression has the value of the left
operand after the assignment, but is not an lvalue. The type of an
assignment expression is the type the left operand would have after
lvalue conversion. The side effect of updating the stored value of the
left operand is sequenced after the value computations of the left and
right operands. The evaluations of the operands are unsequenced.
And the types of the operands for a conditional expression must satisfy this constraint:
6.5.15 Conditional operator - p3
One of the following shall hold for the second and third operands:
both operands have arithmetic type;
both operands have the same structure or union type;
both operands have void type;
both operands are pointers to qualified or unqualified versions of compatible types;
one operand is a pointer and the other is a null pointer constant; or
one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void.
Since struct XTYPE* and struct YTYPE* are not pointers to compatible types (the only bullet which may even apply), and are in fact just pointers to unrelated types, your program is ill-formed.
A major point of contention here is that MSVC isn't a conforming C compiler (not C11 anyway). But the above rules haven't changed much since the last C version that MSVC does support, so there you have it.
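If all you need is to test the returned pointer, one way around the constraint is to give both operands of ?: the same type before the operator sees them. The following is only a sketch: the body of getSomething, the file-scope pointers, and the names test_compare and test_void are made up for illustration and are not from the question.

```c
#include <stddef.h>

struct XTYPE { int x; long y; };
struct YTYPE { long y; int x; };

/* Hypothetical stand-in for the question's getSomething(): returns storage
   that is suitably aligned for either struct. */
void *getSomething(void)
{
    static union { struct XTYPE x; struct YTYPE y; } storage;
    return &storage;
}

struct XTYPE *pX;
struct YTYPE *pY;

/* Option 1: compare each assignment with a null pointer constant.
   Both operands of ?: then have type int, an arithmetic type. */
int test_compare(int b)
{
    void *(*pfFoo)(void) = getSomething;

    return b ? ((pX = (*pfFoo)()) != NULL)
             : ((pY = (*pfFoo)()) != NULL);
}

/* Option 2: convert both results to void *, so that both operands are
   pointers to the same type. */
int test_void(int b)
{
    void *(*pfFoo)(void) = getSomething;
    void *result = b ? (void *)(pX = (*pfFoo)())
                     : (void *)(pY = (*pfFoo)());

    return result != NULL;
}
```

Either form satisfies one of the bullets of 6.5.15 p3, so a conforming compiler has nothing to diagnose in the conditional expression itself.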
Although you aren't using it, the compiler has to work out a sensible type for the result of the ternary (conditional) operation by looking at its second and third operands.
In the first case you gave it two unrelated pointer types, X and Y, for which it can find no common type, so it rejects the expression before the result ever reaches the if statement. (C's comparison and logical operators yield int rather than a dedicated boolean type.)
In the second case you have done the conversion yourself by comparing with 0/NULL: each operand is now an int, so both operands have the same arithmetic type. The sense of the test is unchanged, since the result is nonzero when the pointer is set.
In C++ we have the concept of inheritance and related types, so the compiler can often find a common type to resolve these sorts of statements. In my experience different compilers have treated this situation differently, but with strict compliance settings it should always fail unless one of the operands is a subtype of the other.
One way I used to fix these issues in the old days was to use "double-pling" !! as a cast to boolean keeping the original sense:
if (b ? !!(pX= (*pfFoo)())
: !!(pY= (*pfFoo)()) )
{
;
}
The point of the double-pling is that it can be read as a single operation with a clear intent, while anything else has to be decoded by the reader.
Related
Let's say we have the following expression:
(x>y) ? printf("type of int") : func_string()
So what do we have here?
x>y -> true -> printf("type of int") -> the type of the expression is int (because printf function is an int type function).
or
x>y -> false -> calls func_string() function which for the purpose of the question, returns a string.
The conclusion is that we have only 2 options for the outcome of this expressions:
an int function (printf) OR a string function (func_string).
Which means there are 2 possible types for the expressions.
However, the compiler can't wait until run time to learn the type of the expression; it has to know at compile time whether that type is an int or a string. So will we get a compile error if we try to compile this sort of code, where the two outcomes of the conditional operator have different types?
The standard C17 6.5.15 specifies that the operands must behave according to this:
Constraints
The first operand shall have scalar type.
One of the following shall hold for the second and third operands:
— both operands have arithmetic type;
— both operands have the same structure or union type;
— both operands have void type;
— both operands are pointers to qualified or unqualified versions of compatible types;
— one operand is a pointer and the other is a null pointer constant; or
— one operand is a pointer to an object type and the other is a pointer to a qualified or
unqualified version of void.
In your example, (x>y) ? printf("type of int") : func_string();, printf returns type int which is an arithmetic type. func_string() supposedly returns a char*, which is a pointer type. This use case doesn't match any of the above listed valid scenarios, since the types are not compatible.
And so the compiler reports that the code isn't valid C. Some examples of compiler diagnostics:
icc:
operand types are incompatible ("int" and "char *")
gcc:
pointer/integer type mismatch in conditional expression
Had the 2nd and 3rd operands both been arithmetic types (one float and one int, for example), this rule from 6.5.15 would apply:
If both the second and third operands have arithmetic type, the result type that would be
determined by the usual arithmetic conversions, were they applied to those two operands,
is the type of the result. If both the operands have structure or union type, the result has
that type. If both operands have void type, the result has void type.
To understand that part, you need to understand the meaning of the usual arithmetic conversions, see Implicit type promotion rules
The bottom line is that ?: is not some glorified if-else replacement, but a rather peculiar operator that comes with lots of special rules. ?: also has very few valid use-cases in real C code; one of the few is certain function-like macros.
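To see the "usual arithmetic conversions" rule in action, you can ask the compiler what type a conditional expression has. This is a small sketch that assumes a C11 compiler (for _Generic); the macro and function names are invented for the example.

```c
/* Each macro yields 1 if expression e has the named type, 0 otherwise.
   _Generic only inspects the type of e; e itself is not evaluated. */
#define IS_DOUBLE(e)   _Generic((e), double: 1, default: 0)
#define IS_UNSIGNED(e) _Generic((e), unsigned int: 1, default: 0)

/* int and double operands: the result type is double, whichever
   operand is selected at run time. */
int int_and_double(int c)
{
    return IS_DOUBLE(c ? 1 : 2.0);
}

/* int and unsigned operands: the result type is unsigned int. */
int int_and_unsigned(int c)
{
    return IS_UNSIGNED(c ? -1 : 2u);
}
```

Both functions return 1 for any value of c, because the result type of ?: is fixed at compile time from the types of both operands, not from the branch that ends up being taken.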
Suppose I have type x = EXPR;, where type is some type and EXPR is some arithmetic expression.
In what circumstances is the result of evaluating EXPR coerced? When does this coercion happen? In what cases does this coercion result in undefined behaviour?
NOTE: I previously asked about
unsigned a = 60000, b = 60000;
int c = a * b;
where int is 16 bits, but decided to edit it to the more general case, as this is more useful.
Suppose I have type x = EXPR;, where type is some type and EXPR
is some arithmetic expression.
In what circumstances is the result of evaluating EXPR coerced?
As long as this is a [language-lawyer] question, I feel compelled to observe that no form of the verb "coerce" appears in the language standard. The verb used is usually "convert" and occasionally "promote", whether it occurs explicitly (by evaluating a cast expression) or otherwise.
With that said, the standard defines the behavior of the assignment operator subject to the constraint (C2011, 6.5.16/2) that
An assignment operator shall have a modifiable lvalue as its left operand.
and the constraint (C2011, 6.5.16.1/1; summarized) that one of the following holds:
the left and right operands have arithmetic types;
the left and right operands have compatible structure or union types;
the left and right operand types are pointers to compatible types, and the type pointed to by the left has all the qualifiers of that pointed to by the right;
the left operand has an object pointer type, the right is a pointer to void, and the type pointed to by the left has all the qualifiers of that pointed to by the right;
the left operand has pointer type, and the right is a null pointer constant; or
the left operand has a _Bool type, and the right is a pointer.
Where either of those constraints does not hold, the standard does not define any behavior for the assignment operator, so its behavior is undefined. Where they both do hold the standard addresses this particular question pretty directly:
In simple assignment (=), the value of the right operand is converted
to the type of the assignment expression and replaces the value stored
in the object designated by the left operand.
(C2011 6.5.16.1/2)
, where
The type of an assignment expression is the type the left operand would have after lvalue conversion.
(C2011 6.5.16/3)
So, the answer is technically that EXPR is converted automatically in all cases that satisfy the constraint, and all effects of the assignment (including whether any conversion is performed) are undefined otherwise. I say "technically" because the plain wording of the standard makes no exception for the case where the types of the two operands are identical, but you might not actually want to count that.
When does this coercion happen?
In the abstract-machine sense, it must happen after the value of EXPR is computed, for until then there isn't anything to convert, and before the side effect of the assignment operator is applied (and that must be complete by the sequence point that occurs at the terminating semicolon).
In what cases does this coercion result in undefined behaviour?
According to the rules for conversions presented in section 6.3 of the Standard, for those type combinations permitted by assignment, UB occurs when
a value of real type is converted to a real floating type where the value is out of range for the target type
a non-finite real floating value is converted to any integer type other than _Bool
a finite real floating value is converted to any integer type other than _Bool that cannot represent the result of truncating it to an integer (even if the target type is unsigned)
a complex value is converted to any real type, and UB arises from conversion of its real part, taken as a value of the corresponding real type, to the destination type
a value of real type is converted to a complex type where UB arises from the conversion of the source value to the target type's corresponding real type
a complex value is converted to another complex type where UB arises from conversion of either the real or the imaginary part, taken as a value of the source type's corresponding real type, to the destination type's corresponding real type
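For the floating-to-integer cases in the list above, the usual defence is to range-check before converting. A minimal sketch follows, assuming int is no wider than 32 bits and double is an IEEE 754 binary64 (so the two bounds are exactly representable); the function name is made up.

```c
#include <limits.h>
#include <stdbool.h>

/* Converts d to int only when the truncated value is representable.
   Returns false (and leaves *out alone) for out-of-range values,
   infinities and NaN. */
bool double_to_int_checked(double d, int *out)
{
    /* NaN makes both comparisons false, so it is rejected as well. */
    if (!(d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0))
        return false;

    *out = (int)d;   /* truncates toward zero; now known to be in range */
    return true;
}
```

The bounds are deliberately one past the limits of int, because truncation toward zero means a value such as INT_MAX + 0.5 still converts safely.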
Notable cases that do not produce UB include
conversion of any value of integer type to a signed integer type that cannot represent the value. That's implementation-defined (or an implementation-defined signal is raised), not undefined. That's a significant distinction, but it still leaves this case being a portability issue.
conversion of any value of integer type to an unsigned integer type. All such conversions have well-defined results (but the same does not apply to conversions from other scalar types to unsigned integer types).
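Applied to the kind of code in the question (an unsigned product stored in an int), the two bullets above can be sketched like this. The function names are invented, and the example uses UINT_MAX rather than 60000 so that it does not depend on the width of int.

```c
#include <limits.h>
#include <stdbool.h>

/* The multiplication is carried out in unsigned arithmetic, so it is
   well defined: the result wraps modulo UINT_MAX + 1. */
unsigned mul_wrap(unsigned a, unsigned b)
{
    return a * b;
}

/* Converting an out-of-range unsigned value to int is implementation-defined
   (or raises an implementation-defined signal), so check first. */
bool unsigned_to_int_checked(unsigned u, int *out)
{
    if (u > (unsigned)INT_MAX)
        return false;

    *out = (int)u;
    return true;
}
```

With these, mul_wrap(UINT_MAX, 2u) yields UINT_MAX - 1, and passing that result to unsigned_to_int_checked reports failure instead of relying on whatever the implementation does for the conversion.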
Notable cases that do not satisfy the constraint for simple assignment include
the left operand has const-qualified type or otherwise is not a modifiable lvalue
one operand has a pointer type and the other an integer type, except if the integer type is _Bool and it appears on the left; and
both operands have pointer types pointing to non-void, incompatible types.
Note also that the rules for evaluating some operations specify undefinedness rules that are not wholly analogous with the rules for conversions or with each other. For the most part, these cases revolve around operations with operands and / or results of signed integer types.
In the example, both operands of a * b already have type unsigned, so the usual arithmetic conversions leave them unchanged and the multiplication is performed in unsigned arithmetic. Only afterwards is the result converted to int, the type of the object being initialized; the declared type of c has no influence on how a * b itself is evaluated.
For a binary arithmetic operator, implicit conversion is to the common type given by the usual arithmetic conversions (roughly, the type of the operand with the higher conversion rank), not to the type of whichever operand the operator's precedence favours.
Implicit conversion is allowed in C, but an explicit cast can be better practice where a conversion matters, since it makes the sequence clear to anyone reading the code.
For scalar values, the assignment operator seems to copy the right-hand side value to the left. How does that work for composite data types? Eg, if I have a nested struct
struct inner {
int b;
};
struct outer {
struct inner a;
};
int main() {
struct outer s1 = { .a = {.b=1}};
struct outer s2 = s1;
}
does the assignment recursively deep copy the values?
does the same happen when passing the struct to a function?
By experimenting it seems like it does, but can anyone point to the specification of the behavior?
There is no "recursion"; it copies all the (value) bits of the value. Pointers are not magically followed of course, the assignment operator wouldn't know how to duplicate the pointed-to data.
You can think of
a = b;
as shorthand for
memcpy(&a, &b, sizeof a);
The sizeof is misleading of course, since we know the types are the same on both sides but I don't think __typeof__ helps.
The draft C11 spec says (in 6.5.16.1 Simple assignment, paragraph 2):
In simple assignment (=), the value of the right operand is converted to the
type of the assignment expression and replaces the value stored in the object
designated by the left operand.
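A small experiment along these lines makes the difference between "copies the nested members" and "follows pointers" visible. It is only an illustration: the pointer member p and the function name are additions that are not in the question's struct.

```c
struct inner {
    int b;
};

struct outer {
    struct inner a;   /* nested struct: stored inside outer, copied with it */
    int *p;           /* pointer: only the pointer value is copied */
};

/* Returns 1 if assignment behaved like a member-by-member (shallow) copy. */
int copy_demo(void)
{
    int shared = 5;
    struct outer s1 = { .a = { .b = 1 }, .p = &shared };
    struct outer s2 = s1;

    s2.a.b = 2;    /* changes only s2's copy of the nested struct */
    *s2.p = 7;     /* changes 'shared', which s1.p also points at */

    return s1.a.b == 1 && s2.a.b == 2 && s1.p == s2.p && *s1.p == 7;
}
```

copy_demo() returns 1: the nested struct was duplicated, while both copies still refer to the same int through p.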
does the assignment recursively deep copy the values?
Yes, just as if you had used memcpy. Pointers are copied, but not what they point at. The term "deep copy" often means: also copy what the pointers point at (for example in a C++ copy constructor).
One exception: any padding bytes in the destination may hold unspecified values after the assignment. (Meaning that memcmp on a struct might be unsafe.)
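Because of the padding issue, comparing two structs is more reliably done member by member than with memcmp. A minimal sketch, with a made-up struct that is likely (but not guaranteed) to contain padding between its members:

```c
struct padded {
    char c;   /* many ABIs insert padding bytes after this member */
    int  i;
};

/* Compares only the members, so the contents of any padding bytes
   cannot affect the result. Returns 1 if equal, 0 otherwise. */
int padded_equal(const struct padded *lhs, const struct padded *rhs)
{
    return lhs->c == rhs->c && lhs->i == rhs->i;
}
```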
does the same happen when passing the struct to a function?
Yes. See the reference to 6.5.2.2 below.
By experimenting it seems like it does, but can anyone point to the specification of the behavior?
C17 6.5.16:
An assignment operator stores a value in the object designated by the left operand. An
assignment expression has the value of the left operand after the assignment, but is not
an lvalue. The type of an assignment expression is the type the left operand would have
after lvalue conversion.
(Lvalue conversion in this case isn't relevant, since both structs must be of 100% identical and compatible types. Simply put: two structs are compatible if they have exactly the same members.)
C17 6.5.16.1 Simple assignment:
the left operand has an atomic, qualified, or unqualified version of a structure or union
type compatible with the type of the right;
C17 6.5.2.2 Function calls, §7:
If the expression that denotes the called function has a type that does include a prototype,
the arguments are implicitly converted, as if by assignment, ...
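The "as if by assignment" wording means the callee works on its own copy. A quick sketch using the question's structs (the two function names are invented for the example):

```c
struct inner {
    int b;
};

struct outer {
    struct inner a;
};

/* Receives its own copy of the argument and modifies only that copy. */
int bump_copy(struct outer s)
{
    s.a.b += 100;
    return s.a.b;
}

/* Returns the caller's value after the call. */
int caller_value_after_call(void)
{
    struct outer s1 = { .a = { .b = 1 } };

    (void)bump_copy(s1);   /* s1 is copied into the parameter */
    return s1.a.b;         /* the caller's object is untouched */
}
```

caller_value_after_call() returns 1, even though bump_copy saw and returned 101 for its private copy.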
I wrote a simple C program and I was expecting that it will fail in compilation but unfortunately it compiles and runs fine in C, but fails in compilation in C++.
Consider below program:
#include <stdio.h>
int main()
{
char *c=333;
int *i=333;
long *l=333;
float *f=333;
double *d=333;
printf("c = %u, c+1 = %u",c,c+1);
return 0;
}
Visit this link: http://ideone.com/vnKZnx
I think that this program definitely can't compile in C++ due to C++'s strong type checking. Why does this program compile in C? The compiler does show warnings as well. I am using the Orwell Dev-C++ IDE (gcc 4.8.1 compiler). I also tried the same program on another compiler (Borland Turbo C++ 4.5), saved with the extension .c, and on that compiler it failed to compile.
This code is neither legal C nor legal C++.
N1570 §6.7.9/p11:
The initializer for a scalar shall be a single expression, optionally
enclosed in braces. The initial value of the object is that of the
expression (after conversion); the same type constraints and
conversions as for simple assignment apply, taking the type of the
scalar to be the unqualified version of its declared type.
§6.5.16.1/p1 provides that for simple assignment:
One of the following shall hold:
the left operand has atomic, qualified, or unqualified arithmetic type, and the right has arithmetic type;
the left operand has an atomic, qualified, or unqualified version of a structure or union type compatible with the type of the right;
the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue
conversion) both operands are pointers to qualified or unqualified
versions of compatible types, and the type pointed to by the left has
all the qualifiers of the type pointed to by the right;
the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue
conversion) one operand is a pointer to an object type, and the other
is a pointer to a qualified or unqualified version of void, and the
type pointed to by the left has all the qualifiers of the type pointed
to by the right;
the left operand is an atomic, qualified, or unqualified pointer, and the right is a null pointer constant; or
the left operand has type atomic, qualified, or unqualified _Bool, and the right is a pointer.
None of which matches a pointer on the left and 333 on the right. §6.5.16.1/p1 is a constraint, and conforming implementations are required to produce a diagnostic upon a constraint violation (§5.1.1.3/p1):
A conforming implementation shall produce at least one diagnostic
message (identified in an implementation-defined manner) if a
preprocessing translation unit or translation unit contains a
violation of any syntax rule or constraint, even if the behavior is
also explicitly specified as undefined or implementation-defined.
It happens that GCC decides to produce a warning instead of an error in C mode and continue to compile it, but it doesn't have to.
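If you really do want a pointer with a particular numeric value, an explicit cast removes the constraint violation, though what the resulting pointer means is implementation-defined. The sketch below assumes the optional type uintptr_t is available (it is on mainstream platforms); the function names are made up.

```c
#include <stdint.h>

/* Explicit cast: no constraint is violated, but the resulting pointer value
   is implementation-defined and must not be dereferenced unless the platform
   documents that address as valid. */
char *pointer_from_address(uintptr_t address)
{
    return (char *)address;
}

/* Round trip through uintptr_t: a valid pointer to void converted to
   uintptr_t and back compares equal to the original. Returns 1 on success. */
int round_trips(void *p)
{
    uintptr_t bits = (uintptr_t)p;

    return (void *)bits == p;
}
```

To print such a pointer, use the %p conversion with a void * argument, for example printf("%p\n", (void *)c);, rather than %u as in the question.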
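C can convert integers to pointers, but it takes an explicit cast: char *c = (char *)123 will, on typical flat-memory implementations, set c to point to the byte at address 123 (the result is implementation-defined).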
While this is nearly useless and almost certainly an error in desktop programming, in embedded systems it is necessary to interface with the hardware, which may look for values in certain, hardcoded memory addresses.
Technically there is nothing wrong with your code. You are initializing the pointers with the integer constant 333 and then printing the address. The warning is probably shown because the integer value is implicitly converted to a pointer type.
The problem will start when you try to dereference the pointers. That will most likely cause a segmentation fault.
Recently I had code (in C) where I passed the address of an int to a function expecting a pointer to unsigned char. Is this not valid? Is this UB or what?
e.g.,
void f(unsigned char*p)
{
// do something
}
// Call it somewhere
int x = 0; // actually it was uint32_t if it makes difference
f(&x);
I did get a warning though ... Compiled in Xcode
int * and unsigned char * are not considered compatible types, so implicit conversion will issue a diagnostic. However, the standard does allow explicit casting between different pointers, subject to two rules (C11 section 6.3.2.3):
Converting a type "pointer to A" to type "pointer to B" and back to "pointer to A" shall result in the same original pointer, provided the pointer is correctly aligned for B. (i.e., if p is of type int *, then (int *)(double *)p will yield p)
Converting any object pointer to a char * yields a pointer to the lowest-addressed byte of the object.
So, in your case, an explicit (unsigned char *) cast will yield a conforming program without any undefined behavior.
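As a concrete sketch of the call with the cast in place (the byte-summing body and both function names are invented for the example; the question's f() did not say what it does with the bytes):

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as the question's f(): works on the raw bytes of an object. */
unsigned sum_bytes(const unsigned char *p, size_t n)
{
    unsigned total = 0;

    for (size_t i = 0; i < n; ++i)
        total += p[i];
    return total;
}

unsigned sum_of_u32_bytes(uint32_t x)
{
    /* The explicit cast makes the pointer conversion valid; reading any
       object through an unsigned char pointer is allowed. */
    return sum_bytes((const unsigned char *)&x, sizeof x);
}
```

The sum is the same on little-endian and big-endian machines, since only the order of the bytes differs, not their values.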
The cast is required, see C11 (n1570) 6.5.2.2 p.2:
[…] Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter.
This refers to the rules for assignment, the relevant part is (ibid. 6.5.16.1 p.1)
One of the following shall hold:
[…]
the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right.
[…]
And unsigned char isn’t compatible with int.
These rules both appear in a “constraint” section, where “shall” means that the compiler has to give a “diagnostic message” (cf. C11 5.1.1.3) and may stop compiling (or whatever, everything beyond that diagnostic is, strictly speaking, out of the scope of the C standard). Your code is an example of a constraint violation.
Other examples of constraint violations are calling a (prototyped and non-variadic) function with the wrong number of arguments, using bitwise operators on doubles, or redeclaring an identifier with an incompatible type in the same scope, ibid. 5.1.1.3 p.2:
Example
An implementation shall issue a diagnostic for the translation unit:
char i;
int i;
because in those cases where wording in this International Standard describes the behavior for a construct as being both a constraint error and resulting in undefined behavior, the constraint error shall be diagnosed.
Syntax violations are treated equally.
So, strictly speaking, your program is as invalid as
int foo(int);
int main() {
It's my birthday!
foo(0.5 ^ 42, 12);
}
which a conforming implementation very well may compile, maybe to a program having undefined behavior, as long as it gives at least one diagnostic (e.g. a warning).
For e.g. gcc, a warning is a diagnostic (you can turn syntax and constraint violations into errors with -pedantic-errors).
The term ill-formed may be used to refer to either a syntax or a constraint violation. The C standard doesn't use this term, but cf. C++11 (n3242):
1.3.9
ill-formed program
program that is not well formed
1.3.26
well-formed program
C++ program constructed according to the syntax rules, diagnosable semantic rules, and the One Definition Rule.
The language-lawyer attitude aside, your code will probably always either be not compiled at all (which should be reason enough to do the cast), or show the expected behavior.
C11, §6.5.2.2:
2 Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter.
§6.5.16.1 describes assignment in terms of a list of constraints, including
the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right
int and unsigned char are not compatible types, so the program is not well-formed and the Standard doesn't even guarantee that it will compile.
Although some would say "it is undefined behavior according to the standard", here is what happens de-facto (answering by an example):
Safe:
void f(char* p)
{
char r, w = 0;
r = p[0]; // read access
p[0] = w; // write access
}
...
int x = 0;
f((char*)&x); // the cast is only there to suppress the compilation warning
This code is safe as long as you access memory with p[i], where 0 <= i <= sizeof(int)-1.
Unsafe:
void f(int* p)
{
int r, w = 0;
r = p[0]; // read access
p[0] = w; // write access
}
...
char x[sizeof(int)] = {0};
f((int*)&x); // the cast is only there to suppress the compilation warning
This code is unsafe because although the allocated variable is large enough to accommodate an int, its address in memory is not necessarily a multiple of sizeof(int). As a result, unless the compiler (as well as the underlying HW architecture) supports unaligned load/store operations, a memory access violation will occur during runtime if the address of this variable in memory is indeed not properly aligned.
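A common way to avoid the alignment problem is to copy the bytes into a real int with memcpy instead of dereferencing a cast pointer. This also sidesteps the effective-type (strict aliasing) rules, which the cast version runs into as well. A minimal sketch, with invented function names:

```c
#include <string.h>

/* Reads an int from a byte buffer at any alignment. */
int load_int(const unsigned char *bytes)
{
    int value;

    memcpy(&value, bytes, sizeof value);
    return value;
}

/* Writes an int into a byte buffer at any alignment. */
void store_int(unsigned char *bytes, int value)
{
    memcpy(bytes, &value, sizeof value);
}
```

Both functions work even when bytes points at an odd address, for example one byte into a larger char array.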