Clarification on an example of unions in C11 standard - c

The following example is given in the C11 standard, 6.5.2.3
The following is not a valid fragment (because the union type is not
visible within function f):
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g()
{
union {
struct t1 s1;
struct t2 s2;
} u;
/* ... */
return f(&u.s1, &u.s2);
}
Why does it matter that the union type is visible to the function f?
In reading through the relevant section a couple times, I could not see anything in the containing section disallowing this.

It matters because of 6.5.2.3 paragraph 6 (emphasis added):
One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common initial
sequence (see below), and if the union object currently contains one
of these structures, it is permitted to inspect the common initial
part of any of them anywhere that a declaration of the completed type
of the union is visible. Two structures share a common initial
sequence if corresponding members have compatible types (and, for
bit-fields, the same widths) for a sequence of one or more initial
members.
It's not an error that requires a diagnostic (a syntax error or constraint violation), but the behavior is undefined because the m members of the struct t1 and struct t2 objects occupy the same storage, but because struct t1 and struct t2 are different types the compiler is permitted to assume that they don't -- specifically that changes to p1->m won't affect the value of p2->m. The compiler could, for example, save the value of p1->m in a register on first access, and then not reload it from memory on the second access.

Note: This answer doesn't directly answer your question but I think it is relevant and is too big to go in comments.
I think the example in the code is actually correct. It's true that the union common initial sequence rule doesn't apply; but nor is there any other rule which would make this code incorrect.
The purpose of the common initial sequence rule is to guarantee the same layout of the structs. However that is not even an issue here, as the structs only contain a single int, and structs are not permitted to have initial padding.
Note that , as discussed here, sections in ISO/IEC documents titled Note or Example are "non-normative" which means they do not actually form a part of the specification.
It has been suggested that this code violates the strict aliasing rule. Here is the rule, from C11 6.5/7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
[...]
In the example, the object being accessed (denoted by p2->m or p1->m) have type int. The lvalue expressions p1->m and p2->m have type int. Since int is compatible with int, there is no violation.
It's true that p2->m means (*p2).m, however this expression does not access *p2. It only accesses the m.
Either of the following would be undefined:
*p1 = *(struct t1 *)p2; // strict aliasing: struct t2 not compatible with struct t1
p2->m = p1->m++; // object modified twice without sequence point

Given the declarations:
union U { int x; } u,*up = &u;
struct S { int x; } s,*sp = &s;
the lvalues u.x, up->x, s.x, and sp->x are all of type int, but any access to any of those lvalues will (at least with the pointers initialized as shown) will also access the stored value of an object of type union U or struct S. Since N1570 6.5p7 only allows objects of those types to be accessed via lvalues whose types are either character types, or other structs or unions that contain objects of type union U and struct S, it would not impose any requirements about the behavior of code that attempts to use any of those lvalues.
I think it's clear that the authors of the Standard intended that compilers allow objects of struct or union types to be accessed using lvalues of member type in at least some circumstances, but not necessarily that they allow arbitrary lvalues of member type to access objects of struct or union types. There is nothing normative to differentiate the circumstances where such accesses should be allowed or disallowed, but there is a footnote to suggest that the purpose of the rule is to indicate when things may or may not alias.
If one interprets the rule as only applying in cases where lvalues are used in ways that alias seemingly-unrelated lvalues of other types, such an interpretation would define the behavior of code like:
struct s1 {int x; float y;};
struct s2 {int x; double y;};
union s1s2 { struct s1 v1; struct s2 v2; };
int get_x(void *p) { return ((struct s1*)p)->x; }
when the latter was passed a struct s1*, struct s2*, or union s1s2* that identifies an object of its type, or the freshly-derived address of either member of union s1s2. In any context where an implementation would see enough to have reason to care about whether operations on the original and derived lvalues would affect each other, it would be able to see the relationship between them.
Note, however, that that such an implementation would not be required to allow for the possibility of aliasing in code like the following:
struct position {double px,py,pz;};
struct velocity {double vx,vy,vz;};
void update_vectors(struct position *pos, struct velocity *vel, int n)
{
for (int i=0; i<n; i++)
{
pos[i].px += vel[i].vx;
pos[i].py += vel[i].vy;
pos[i].pz += vel[i].vz;
}
}
even though the Common Initial Sequence guarantee would seem to allow for that.
There are many differences between the two examples, and thus many indications that a compiler could use to allow for the realistic possibility of the first code is passed a struct s2*, it might accessing a struct s2, without having to allow for the more dubious possibility that operations upon pos[] in the second examine might affect elements of vel[].
Many implementations seeking to usefully support the Common Initial Sequence rule in useful fashion would be able to handle the first even if no union type were declared, and I don't know that the authors of the Standard intended that merely adding a union type declaration should force compilers to allow for the possibility of arbitrary aliasing among common initial sequences of members therein. The most natural intention I can see for mentioning union types would be that compilers which are unable to perceive any of the numerous clues present in the first example could use the presence or absence of any complete union type declaration featuring two types as an indication of whether lvalues of one such type might be used to access another.
Note neither N1570 P6.5p7 nor its predecessors make any effort to describe all cases where quality implementations should behave predictably when given code that uses aggregates. Most such cases are left as Quality of Implementation issues. Since low-quality-but-conforming implementations are allowed to behave nonsensically for almost any reason they see fit, there was no perceived need to complicate the Standard with cases that anyone making a bona fide effort to write a quality implementation would handle whether or not it was required for conformance.

Related

Is that invoking a function with the stack pointer pointing to the type only visable in other function vaild

I'm reading ISO/IEC 9899:2017 recently,following is the source link
https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf
And what makes me confused is the following statements, because the standard said it is not a valid fragment (because the union type is not visible within function f).
But I got the correct return value when I run this code.
#include <stdio.h>
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2){
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g(){
union U{
struct t1 s1;
struct t2 s2;
} u = {-1};
return f(&u.s1, &u.s2);
}
int main(){
int i = g();
printf("%d\n",i);
}
So my question is:
Is that valid parsing a pointer of union's member which is a structure defined in file scope to another function where union type is not visible to?
But I got the correct return value when I run this code.
A given compiler accepting the code and the compiled program exhibiting the expected observable behavior are not reliable indicators of the code being correct with respect to the language specification. Nor, therefore, are they reliable predictors of whether a different compiler will accept the code or produce a program that exhibits the expected observable behavior.
Conforming compilers emit diagnostics about certain classes of errors, but they are not required to diagnose all errors, and there are some classes of errors that could not be detected at compile time even if the compiler wanted to do. Generally speaking, compilation and / or execution of incorrect code produces undefined behavior, which many times does not involve any error messages, and which sometimes is even the behavior that was expected. Bottom line: that a program produces the correct or expected result does not prove that the program is correct.
As for the main question,
what makes me confused is the following statements, because the standard said it is not a valid fragment (because the union type is not visible within function f).
[...]
So my question is: Is that valid [passing] a pointer of union's member which is a structure defined in file scope to another function where union type is not visible to?
You have misunderstood the nature of the problem. It is not inherent in passing a pointer to a member of a union. Rather, it has to do with accessing more than one member of the same union object.
The general rule arises from
The value of at most one of the
members can be stored in a union object at any time
(C17 6.7.2.1/16)
and the so-called strict aliasing rule, paragraph 6.5/7:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
[a type compatible with the objects effective type, plus / minus qualification, or the corresponding signed / unsigned type of one of the above, or]
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
The storage for every member of a given union object overlaps the storage of all the others (which is why the union can hold only one at a time), therefore with ...
union U{
struct t1 s1;
struct t2 s2;
} u = {-1};
... &u.s1 and &u.s2 point to the same storage. With the given initialization, its effective type is struct s1, and it is the initial part of a possibly larger block of storage whose effective type is union U.
Structure types with different tags are never compatible with each other, so the strict aliasing rule would be violated by g() accessing that initial value via the pointer &u.s2, except that the specification carves out a special case:
One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible.
(C17 6.5.2.3/6)
This is precisely what the example you're looking at is about. Because struct t1 and struct t2 have a common initial sequence consisting of their respective members m, and because the union object in function g() initially does contain a value for its member s1, of type struct t1, it is permitted in g() to access u.s2.m, including indirectly via &u.s2, even though u.s2 is not the member that currently contains a value.
However, 6.5.2.3/6 does not apply in function f(), because type union U is not visible there. Therefore, although it's fine for f() to access p1->m, it produces UB for it to attempt to access p2->m. This is the claim you inquired about.

C: Is accessing initial member of nested struct using pointer cast to "outer" struct type defined? [duplicate]

This question already has answers here:
Are C-structs with the same members types guaranteed to have the same layout in memory?
(4 answers)
Closed 1 year ago.
I'm trying to understand the so-called "common initial sequence" rule for C aliasing analysis. This question does not concern C++.
Specifically, according to resources (for example the CPython PEP 3123),
[A] value of a struct type may also be accessed through a pointer to the first field. E.g. if a struct starts with an int, the struct * may also be cast to an int *, allowing to write int values into the first field.
(emphasis mine).
My question can be roughly phrased as "does the ability to access a struct by pointer to first-member-type pierce nested structs?" That is, what happens if access is via a pointer whose pointed-to type (let's say type struct A) isn't exactly the same type as that of the first member (let's say type struct B), but that pointed-to type (struct A) has common first initial sequence with struct B, and the "underlying" access is only done to that common initial sequence?
(I'm chiefly interested in structs, but I can imagine this question may also pertain to unions, although I imagine unions come with their own tricky bits w.r.t. aliasing.)
This phrasing may not clear, so I tried to illustrate my intention with the code as follows (also available at godbolt.org, and the code seem to compile just fine with the intended effect):
/* Base object as first member of extension types. */
struct base {
unsigned int flags;
};
/* Types extending the "base" by including it as first member */
struct file_object {
struct base attr;
int index;
unsigned int size;
};
struct socket_object {
struct base attr;
int id;
int type;
int status;
};
/* Another base-type with an additional member, but the first member is
* compatible with that of "struct base" */
struct extended_base {
unsigned int flags;
unsigned int mode;
};
/* A type that derives from extended_base */
struct extended_socket_object {
struct extended_base e_attr; /* Using "extended" base here */
int e_id;
int e_type;
int e_status;
int some_other_field;
};
/* Function intended for structs "deriving from struct base" */
unsigned int set_flag(struct base *objattr, unsigned int flag)
{
objattr->flags |= flag;
return objattr->flags;
}
extern struct file_object *file;
extern struct socket_object *sock;
extern struct extended_socket_object *esock;
void access_files(void)
{
/* Cast to pointer-to-first-member-type and use it */
set_flag((struct base *)file, 1);
set_flag((struct base *)sock, 1);
/* Question: is the following access defined?
* Notice that it's cast to (struct base *), rather than
* (struct extended_base *), although the two structs share the same common
* initial member and it is this member that's actually accessed. */
set_flag((struct base *)esock, 1);
return;
}
This is not safe as you're attempting to access an object of type struct extended_base as though it were an object of type struct base.
However, there are rules that allow access to two structures initial common sequence via a union. From section 6.5.2.3p6 of the C standard:
One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members
So if you change the definition of struct extended_socket_object to this:
struct extended_socket_object {
union u_base {
struct base b_attr;
struct extended_base e_attr;
};
int e_id;
int e_type;
int e_status;
int some_other_field;
};
Then a struct extended_socket_object * may be converted to union u_base * which may in turn be converted to a struct base *. This is allowed as per section 6.7.2.1 p15 and p16:
15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase
in the order in which they are declared. A pointer to a structure
object, suitably converted, points to its initial member (or
if that member is a bit-field, then to the unit in which it
resides), and vice versa. There may be unnamed padding within
a structure object, but not at its beginning.
16 The size of a union is sufficient to contain the largest of its members. The value of at most one of the
members can be stored in a union object at any time. A
pointer to a union object, suitably converted, points to each
of its members (or if a member is a bit-field, then to the
unit in which it resides), and vice versa.
It is then allowed to access b_attr->flags because of the union it resides in via 6.5.2.3p6.
According to the C Standard (6.7.2.1 Structure and union specifiers, paragraph 13):
A pointer to a structure object, suitably converted, points to its
initial member (or if that member is a bit-field, then to the unit in
which it resides), and vice versa.
So, converting esock to struct extended_base * and then converting it to unsigned int * must give us a pointer to the flags field, according to the Standard.
I'm not sure if converting to to struct base * counts as "suitably converted" or not. My guess is that it would work at any machine you will try it on, but I wouldn't recommend it.
I think it would be safest (and also make the code more clear) if you simply keep a member of type struct base inside struct extended_base (instead of the member of type unsigned int). After doing that, you have two options:
When you want to send it to a function, write explicitly: esock->e_attr.base (instead of (struct base *)esock). This is what I would recommend.
You can also write: (struct base *) (struct extended_base *) esock which is guaranteed to work, but I think it is less clear, and also more dangerous (if in the future you will want to add or accidentaly add another member in the beginning of the struct).
After reading up into the standard's text following the other answers (thanks!!) I think I may try to answer my own question (which was a bit misleading to begin with, see below)
As the other answers pointed out, there appear to be two somewhat overlapping concerns in this question -
"common initial sequence" -- in the standard documents this specifically refers to the context of a union having several structs as member and when these member structs share some compatible members beginning from the first. (§6.5.2.3 " Structure and union members", p6 -- Thanks, #dbush!).
My reading: the language spec suggests that, if at the site of access to these "apparently" different structs it is made clear that they actually belong to the same union, and that the access is done through the union, it is permitted; otherwise, it is not.
I think the requirement is meant to work with type-based aliasing rules: if these structs do indeed alias each other, this fact must be made clear at compile time (by involving the union). When the compiler sees pointers to different types of structs, it can't, in the most general case, deduce whether they may have belonged to some union somewhere. In that case, if it invokes type-based alias analysis, the code will be miscompiled. So the standard requires that the union is made visible.
"a pointer (to struct), when suitably converted, points to its initial member" (§6.7.2.1 "Structure and union specifiers", p15) -- this sounds tantalizingly close to 1., but it's less about aliasing than about a) the implementation requirements for struct and b) "suitable conversion" of pointers. (Thanks, #Orielno!)
My reading: the "suitable conversion" appears to mean "see everything else in the standard", that is, no matter if the "conversion" is performed by type cast or assignment (or a series of them), being "suitable" suggests "all constraints must be satisfied at all steps". The "initial-member" rule, I think, simply says that the actual location of the struct is exactly the same as the initial member: there cannot be padding in front of the first member (this is explicitly stated in the same paragraph).
But no matter how we make use of this fact to convert pointers, the code must still be subject to constraints governing conversion, because a pointer is not just a machine representation of some location -- its value still has to be correctly interpreted in the context of types. A counterexample would be a conversion involving an assignment that discards const from the pointed-to type: this violates a constraint and cannot be suitable.
The somewhat misleading thing in my original post was to suggest that rule 2 had something to do with "common initial sequence", where it is not directly related to that concept.
So for my own question, I tend to answer, to my own surprise, "yes, it is valid". The reason is that the pointer conversion by cast in expression (struct base *)esock is "legal in the letter of the law" -- the standard simply says that (§6.5.4 "Cast operators", p3)
Conversions that involve pointers, other than where permitted by the constraints of 6.5.16.1 (note: constraints governing simple assignment), shall be specified by means of an explicit cast.
Since the expression is indeed an explicit cast, in and by itself it doesn't contradict the standard. The "conversion" is "suitable". Further function call to set_flag() correctly dereferences the pointer by virtue of the suitable conversion.
But! Indeed the "common initial sequence" becomes important when we want to improve the code. For example, in #dbush's answer, if we want to "inherit from multiple bases" via union, we must make sure that access to base is done where it's apparent that the struct is a member of the union. Also, as #Orielno pointed out, when the code makes us worry about its validity, perhaps switching to an explicitly safe alternative is better even if the code is valid in the first place.
In the language the C Standard was written to describe, an lvalue of the form ptr->memberName would use ptr's type to select a namespace in which to look up memberName, add the offset of that member to the address in ptr, and then access an object of that member type at that address. Once the address and type of the member were determined, the original structure object would play no further rule in the processing of the expression.
When C99 was being written, there was a desire to avoid requiring that a compiler given something like:
struct position {double x,y,z; };
struct velocity {double dx,dy,dz; };
void update_positions(struct positions *pp, struct velocity *vv, int count)
{
for (int i=0; i<count; i++)
{
positions[i].x += vv->dx;
positions[i].y += vv->dy;
positions[i].z += vv->dz;
}
}
must allow for the possibility that a write to e.g. positions[i].y might affect the object of vv->dy even when there is no evidence of any relationship between any object of type struct position and any object of type struct velocity. The Committee agreed that compilers shouldn't be required to accommodate interactions between different structure types in such cases.
I don't think anyone would have seriously disputed the notion that in situations where storage is accessed using a pointer which is freshly and visibly converted from one structure type to another, a quality compiler should accommodate the possibility that the operation might access a structure of the original type. The question of exactly when an implementation would accommodate such possibilities should depend upon what its customers were expecting to do, and was thus left as a quality-of-implementation issue outside the Standard's jurisdiction. The Standard wouldn't forbid implementations from being willfully blind to even the most obvious cases, but that's because the dumber something would be, the less need there should be to prohibit it.
Unfortunately, the authors of clang and gcc have misinterpreted the Standard's failure to forbid them from being obtusely blind to the possibility that a freshly-type-converted pointer might be used to access the same object as a pointer of the original type, as an invitation to behave in such fashion. When using clang or gcc to process any code which would need to make use of the Common Initial Sequence guarantees, one must use -fno-strict-aliasing. When using optimization without that flag, both clang nor gcc are prone to behave in ways inconsistent with any plausible interpretation of the Standard's intent. Whether one views such behaviors as being a result of a really weird interpretation of the Standard, or simply as bugs, I see no reason to expect that gcc or clang will ever behave meaningfully in such cases.

Pointer aliasing between struct and first member of struct [duplicate]

This question already has an answer here:
Struct Extension in C
(1 answer)
Closed 2 years ago.
Pointer aliasing in C is normally undefined behavior (because of strict aliasing), but C11 standard seems allow aliasing a pointer to struct and a pointer to the first member of the struct
C11 6.7.2.1 (15)...A pointer to a structure object... points to its initial member... and vice versa...
So does the following code contain undefined behavior?
struct Foo {
int x;
int y;
};
// does foe return always 100?
int foe() {
struct Foo foo = { .x = 10, .y = 20 }, *pfoo = &foo;
int *px = (int*)pfoo; *px = 100;
return pfoo->x;
}
This code is correct. All versions of Standard C and C++ allow this , although the wording varies.
There's no strict aliasing issue because you access an object of type int via an lvalue of type int. The strict aliasing rule may apply when the lvalue doing the access has a different type to the object stored at the memory location .
The text you quoted covers that the pointer cast actually points to the int object.
The way the Standard is written, an lvalue of a structure or union type may be used to access an object of member type, but there is no provision that would allow an arbitrary lvalue of struct or union's member type to access an object of the struct or union type. Because it would of course be absurd to say that code couldn't use a struct or union member lvalue (which would of course have that member's type) to access a struct or union, all compilers have supported some common access patterns. Because compilers allow such accesses under different circumstances, however, the Standard treats all support for such accesses as a Quality of Implementation issue rather than trying to specify exactly when such support is required.
The approach most consistent with the Standard's wording, and which would allow the most useful optimizations, while also supporting most code that would need to perform type punning or other techniques, would be to say that for purposes of N1570 6.5p7, a pointer which is visibly derived from a pointer or lvalue of a given type may be used within the context of such derivation to access things that would (for purposes of 6.5p7) be accessible using an lvalue of that type. Under such an approach, given a piece of code like:
struct foo { int index,len; int *dat; };
void test1(struct foo *p)
{
int *pp = &foo->len;
*pp = 4;
}
void test2(struct foo *p, int dat)
{
if (p->index < p->len)
{
p->dat[p->index] = dat;
p->index++;
}
}
should recognize that within test1, an access to *pp may access the struct foo object *p, since pp is visibly formed from p. On the other hand, the compiler would not be required to accommodate within test2 the possibility that an object of type struct foo, nor members thereof such as p->index, might be modified through the pointer p->dat, because nothing within test2 would cause the address of a struct foo or any portion thereof to be stored in p->dat.
Clang and gcc, however, instead opt for a different approach, behaving as though 6.5p7 allows struct members to be accessed via arbitrary pointers of their types, but union members can't be accessed via pointers at all, excluding the pointer arithmetic implied by bracketed array expressions. Given union { uint16_t h[4]; uint32_t w[2];} u; clang and gcc will recognize that an access to u.h[i] might interact with u.w[j], but will not recognize that *(u.h+i) might interact with *(u.w+j) even though the Standard defines the meaning of the former expressions with brackets as being equivalent to the latter forms.
Given that compilers consistently handle all of these constructs usefully when type-based aliasing is disabled. The Standard, however, doesn't impose any requirements even in many common cases, and clang and gcc make no promises about behavior of constructs not mandated by the Standard, even if all versions to date have handled such constructs usefully. Thus, I would not recommend relying upon clang or gcc to usefully process anything that involves accessing storage as different types at different times except when using -fno-strict-aliasing, and their wackiness isn't an issue when using that option, so I'd recommend simply using that option unless or until clang and gcc adopt a better defined abstraction.

Is a redeclaration of an untagged structure a compatible type?

For purposes expressed in this question, we want to do this:
typedef struct { int a; } A;
typedef struct { struct { int a; }; int b; } B;
A *BToA(B *b) { return (A *) b; }
B *AToB(A *a) { return (B *) a; }
The desire is that the casts conform to C 2011 6.7.2.1 15, which says “A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa.”
Since the struct { int a; } inside B does not have a name, let’s call it A'.
“Suitably” is not explicitly defined. I presume that if b is a valid pointer to an object of type A', then (A *) b performs a suitable conversion, and, similarly, if a is a pointer to an A' that is in a B, then (B *) a is a suitable conversion.
So the question is: Is A * a valid pointer to an object of type A'?
Per 6.7.6.1, A * is compatible with A' * if A is compatible with A'.
Per 6.2.7, “Two types have compatible type if their types are the same… Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order…”
These cannot be the same type by 6.7.2.3 5: “Each declaration of a structure, union, or enumerated type which does not include a tag declares a distinct type.”
Since they are not the same type, are they compatible? The text in 6.2.7 says they are compatible if declared in separate translation units, but these are in the same translation unit.
As you laid out in the question, the standard clearly and unambiguously says that two struct definitions struct { int a; } in the same translation unit declare two incompatible types. Notwithstanding the fact that this might be "weird". Compilers have always followed the standard.
This seems like reasonable behaviour to me: if you happen to have semantically unrelated structs in your project that coincidentally have a member list with the same types, you do want the compiler to reject an accidental assignment between the two.
Re. the code in your question, according to 6.7.2.1/13,
The members of an anonymous structure or union are considered to be members of the containing structure or union.
So I would treat the definition of B as being equivalent to:
typedef struct { int a; int b; } B;
for purposes of further analysis.
I haven't seen anything in the standard that says that both struct are compatible and thus I would say that they are not.
The only thing that could get you a limited compatibility between the structures is the use of an union, as mentioned in 6.7.2.1§6:
One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the completed type of the union
is visible.
i.e., something like
typedef struct { int a; } A;
typedef struct { union { struct { int a; }; A export; }; int b; } B;
A *BToA(B *b) { return &b->export; }
B *AToB(A *a) { return (B *) a; }
should be safe, but for read accesses only: the standard did not really bother to specify what "inspecting" the common initial sequence means, but seems to use it in opposition to "modify".
There are two situations in which compatibility of structures is relevant:
In deciding whether values or pointers of one type may be coerced into values or pointers of the other, without use of the casting operator and without producing a diagnostic. Note that for this purpose, separately-declared structures are incompatible even if they are structurally identical, but that this kind of compatibility is irrelevant when passing structures or pointers between compilation units.
In deciding whether a value of, or pointer to one type, may be safely given to code which expects the other. Passing untagged structures between compilation units would be impossible if structurally-identical types were not regarded as compatible for this purpose. Compilers used to regard structurally-identical types as compatible for this purpose even within a compilation unit, and there was never any good reason for compilers to do otherwise in cases where one or both types are untagged, but because the Standard doesn't mandate such treatment it has become fashionable for compilers to senselessly weaken the language by blithely assuming that a pointer to one such type won't be used to access members of another.
Unfortunately, when the Standard was written, its authors didn't think it was important to expressly mandate all the obviously-useful things that compilers were already doing, and which sensible compilers would continue to do. The net result is that useful constructs which used to be supported and non-controversial will be unreliable unless otherwise-useful optimizations are disabled.

What does C standard say about a union of two identical types

Is it possible to to have the following assert fail with any compiler on any architecture?
union { int x; int y; } u;
u.x = 19;
assert(u.x == u.y);
C99 Makes a special guarantee for a case when two members of a union are structures that share an initial sequence of fields:
struct X {int a; /* other fields may follow */ };
struct Y {int a; /* other fields may follow */ };
union {X x; Y y;} u;
u.x.a = 19;
assert(u.x.a == u.y.a); // Guaranteed never to fail by 6.5.2.3-5.
6.5.2.3-5 : One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the complete type of the union is
visible. Two structures share a common initial sequence if corresponding members have
compatible types (and, for bit-fields, the same widths) for a sequence of one or more
initial members.
However, I was unable to find a comparable guarantee for non-structured types inside a union. This may very well be an omission, though: if the standard goes some length to describe what must happen with structured types that are not even the same, it should have clarified the same point for simpler, non-structured types.
The assert in the problem will never fail in an implementation of standard C because accessing u.y after an assignment to u.x is required to reinterpret the bytes of u.x as the type of u.y. Since the types are the same, the reinterpretation produces the same value.
This requirement is noted in C 2011 (N1570) 6.5.2.3 note 95, which indicates it derives from clause 6.2.6, which covers the representations of types. Note 95 says:
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.
(N1570 is an unofficial draft but is readily available on the net.)
I believe this question is very hard to answer in the manner you seem to expect.
As far as I know, reading one field of a union that is not the one that was most recently wwritten to, is undefined behavior.
Thus, it's impossible to answer with "no", since any compiler writer is free to specifically detect this and make it fail just out of spite, if they feel like it.

Resources