Why don't structures in headers violate ODR across multiple translation units?


From what I understand, the main reason people separate function declarations and definitions is so that the functions can be used in multiple compilation units. So then I was wondering, what's the point of violating DRY this way, if structures don't have prototypes and would still cause ODR problems across compilation units? I decided to try and define a structure twice using a header across two compilation units, and then combining them, but the code compiled without any errors.
Here is what I did:
main.c:
#include "test.h"
int main() {
    return 0;
}
a.c:
#include "test.h"
test.h:
#ifndef TEST_INCLUDED
#define TEST_INCLUDED
struct test {
    int a;
};
#endif
Then I ran the following gcc commands.
gcc -c a.c
gcc -c main.c
gcc -o final a.o main.o
Why does the above work and not give an error?

C's one definition rule (C17 6.9p5) applies to the definition of a function or an object (i.e. a variable). struct test { int a; }; does not define any object; rather, it declares the identifier test as a tag of the corresponding struct type (6.7.2.3 p7). This declaration is local to the current translation unit (i.e. source file) and it is perfectly fine to have it in several translation units. For that matter, you can even declare the same identifier as a tag for different types in different source files, or in different scopes, so that struct test is an entirely different type in one file / function / block than another. It would probably be confusing, but legal.
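To make the scoping point concrete, here is a minimal sketch (C11, because it uses static_assert). The function names are made up for illustration. It reuses the tag test for an unrelated type inside a block, which hides the file-scope declaration for the rest of that block.

```c
#include <assert.h>
#include <stddef.h>

struct test { int a; };                  /* file-scope tag */

size_t file_scope_size(void)
{
    return sizeof(struct test);          /* refers to the file-scope type */
}

size_t block_scope_size(void)
{
    struct test { char bytes[64]; };     /* new, unrelated type; hides the outer tag */
    static_assert(sizeof(struct test) >= 64, "the inner tag is the one in effect here");
    return sizeof(struct test);
}
```

Both functions spell the type struct test, yet they measure two different types. That is legal, but as noted above it is likely to confuse readers.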
If you actually defined an object in test.h, e.g. struct test my_test = { 42 };, then you would be violating the one definition rule, and the behavior of your program would be undefined. (But that does not necessarily mean you will get an error message; multiple definitions are handled in various different ways by different implementations.)
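The usual way to stay within the rule is to put only the type and an extern declaration in the header, and to define the object in exactly one source file. The sketch below collapses that layout into a single file so it can be compiled on its own. The comments mark which part would live where, and read_my_test is a hypothetical helper added for illustration.

```c
#include <assert.h>

/* --- what test.h would contain (sketch) --- */
struct test { int a; };          /* type only: fine in every translation unit */
extern struct test my_test;      /* declaration: reserves no storage */

/* --- what exactly one .c file would contain --- */
struct test my_test = { 42 };    /* the single definition */

/* Hypothetical accessor; any translation unit including test.h could do this. */
int read_my_test(void)
{
    return my_test.a;
}
```

A caller could then check the value with something like assert(read_my_test() == 42), no matter how many files include the header.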

The key section in the standard is nearly indigestible, but §6.2.7 Compatible type and composite type covers the details, with some forward references:
¶1 Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.6 for declarators.55) Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values.
¶2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
¶3 A composite type can be constructed from two types that are compatible; it is a type that is compatible with both of the two types and satisfies the following conditions:
If both types are array types, the following rules are applied:
If one type is an array of known constant size, the composite type is an array of that size.
Otherwise, if one type is a variable length array whose size is specified by an expression that is not evaluated, the behavior is undefined.
Otherwise, if one type is a variable length array whose size is specified, the composite type is a variable length array of that size.
Otherwise, if one type is a variable length array of unspecified size, the composite type is a variable length array of unspecified size.
Otherwise, both types are arrays of unknown size and the composite type is an array of unknown size.
The element type of the composite type is the composite type of the two element types.
If only one type is a function type with a parameter type list (a function prototype), the composite type is a function prototype with the parameter type list.
If both types are function types with parameter type lists, the type of each parameter in the composite parameter type list is the composite type of the corresponding parameters.
These rules apply recursively to the types from which the two types are derived.
¶4 For an identifier with internal or external linkage declared in a scope in which a prior declaration of that identifier is visible,56) if the prior declaration specifies internal or external linkage, the type of the identifier at the later declaration becomes the composite type.
55) Two types need not be identical to be compatible.
56) As specified in 6.2.1, the later declaration might hide the prior declaration.
Emphasis added
The second part of ¶1 covers explicitly the case of structures, unions and enumerations declared in separate translation units. It is crucial to allowing separate compilation. Note footnote 55 too. However, if you use the same header to define a given structure (union, enumeration) in separate translation units, the chances of you not using a compatible type are small. It can be done if there is conditional compilation and the conditions are different in the two translation units, but you usually have to be trying quite hard to run into problems.
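As a sketch of how conditional compilation could go wrong, consider a hypothetical header record.h whose layout depends on a macro. USE_WIDE_ID and record_id_size are invented names for this illustration. If one translation unit were compiled with the macro defined and another without it, the two struct record types would have members of incompatible types, and sharing objects between the two would be undefined behavior.

```c
#include <assert.h>
#include <stddef.h>

/* record.h (hypothetical): the member type depends on a macro. */
struct record {
#ifdef USE_WIDE_ID               /* hypothetical switch, e.g. set on the command line */
    long long id;
#else
    int id;
#endif
    char name[16];
};

/* Each translation unit sees whichever variant its own macro settings select. */
size_t record_id_size(void)
{
    return sizeof(((struct record *)0)->id);   /* sizeof does not evaluate its operand */
}
```

Compiled without the macro, the id member is an int; the linker has no way of noticing if another object file was built with the other variant.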

Related

What does it mean exactly if two types are "compatible" to each other in C?

The C standard states (emphasis mine):
Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.6 for declarators. 56)
56)Two types need not be identical to be compatible.
Source: C18, §6.2.7/1 - "Compatible type and composite type"
The information I get from these sentences is not much and not very helpful. The cited sections in general also provide no further information about what "compatible" exactly is/means.
I now know that two types are compatible if they are the same type, but that they can also be compatible without being identical.
One case I found where two non-identical types are compatible is a type and a typedef name for that type (or any two typedef names for the same original type), as shown in the examples in §6.7.8/4 and /5:
§6.7.8/4:
EXAMPLE 1 After
typedef int MILES, KLICKSP();
typedef struct { double hi, lo; } range;
the constructions
MILES distance;
extern KLICKSP *metricp;
range x;
range z,*zp;
are all valid declarations. The type of distance is int, that of metricp is "pointer to function with no parameter specification returning int", and that of x and z is the specified structure; zp is a pointer to such a structure. The object distance has a type compatible with any other int object.
and
§6.7.8/5:
EXAMPLE 2 After the declarations
typedef struct s1 { int x; } t1, *tp1;
typedef struct s2 { int x; } t2, *tp2;
type t1 and the type pointed to by tp1 are compatible. Type t1 is also compatible with type struct s1, but not compatible with the types struct s2, t2, the type pointed to by tp2, or int.
but it only shows one example regarding the typedef, where types can be compatible if not identical.
My questions:
Under which circumstances (all of them) can two types be compatible without being identical? And:
What is a "compatible type" exactly? / What does it mean if two types are compatible to each other?
What specifies "compatibility"?
That is what I am looking for and have not yet been able to find in the standard.
If possible, please refer to sections from standard in the answers.
Additional research:
I discovered that compatibility is not necessarily tied to range, representation, or behavior:
§6.2.5/15:
The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.45)
45)CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.
Cited sections from the first quote:
The cited sections 6.7.2, 6.7.3 and 6.7.6 do not explain more what a compatible type is, they only mandate rules for specific cases when a type shall be a compatible type.
§6.7.2/4:
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, 131) but shall be capable of representing the values of all the members of the enumeration. The enumerated type is incomplete until immediately after the } that terminates the list of enumerator declarations, and complete thereafter.
§6.7.3/11:
For two qualified types to be compatible, both shall have the identically qualified version of a compatible type; the order of type qualifiers within a list of specifiers or qualifiers does not affect the specified type.
§6.7.6.1/2:
For two pointer types to be compatible, both shall be identically qualified and both shall be pointers to compatible types.
§6.7.6.2/6:
For two array types to be compatible, both shall have compatible element types, and if both size specifiers are present, and are integer constant expressions, then both size specifiers shall have the same constant value. If the two array types are used in a context which requires them to be compatible, it is undefined behavior if the two size specifiers evaluate to unequal values.
§6.7.6.3/15:
For two function types to be compatible, both shall specify compatible return types.149) Moreover, the parameter type lists, if both are present, shall agree in the number of parameters and in use of the ellipsis terminator; corresponding parameters shall have compatible types. If one type has a parameter type list and the other type is specified by a function declarator that is not part of a function definition and that contains an empty identifier list, the parameter list shall not have an ellipsis terminator and the type of each parameter shall be compatible with the type that results from the application of the default argument promotions. If one type has a parameter type list and the other type is specified by a function definition that contains a (possibly empty) identifier list, both shall agree in the number of parameters, and the type of each prototype parameter shall be compatible with the type that results from the application of the default argument promotions to the type of the corresponding identifier. (In the determination of type compatibility and of a composite type, each parameter declared with function or array type is taken as having the adjusted type and each parameter declared with qualified type is taken as having the unqualified version of its declared type.)
149) If both function types are "old style", parameter types are not compared.
Related:
Compatible types and structures in C
Is a redeclaration of an untagged structure a compatible type?
Compatible types and argument type qualifiers
compatible types vs. strict aliasing rules
Are these compatible function types in C?
Compatible types and ignoring top-level qualifiers in the C type system

It actually comes from this:
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
So from that you can see that anything that is allowed by the standard to work here, will be by necessity, compatible. For example the type int [] is distinct from int [10] but they are compatible, because the standard allows you to declare
extern int foo[];
in one file and define
int foo[10];
in another file, and access foo through the external identifier as an array of unknown size. Therefore these types are compatible, but not identical.
It is stated explicitly in C11/18 6.7.6.2p6:
For two array types to be compatible, both shall have compatible element types, and if both size specifiers are present, and are integer constant expressions, then both size specifiers shall have the same constant value. If the two array types are used in a context which requires them to be compatible, it is undefined behavior if the two size specifiers evaluate to unequal values.
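The same pair of declarations can also appear in a single translation unit, which makes the effect easy to observe. This is only a sketch (C11, for static_assert), and foo_length is a hypothetical helper. Once the definition is visible, foo has the composite type int[10], so sizeof works on it.

```c
#include <assert.h>
#include <stddef.h>

extern int foo[];        /* what a header would declare: array of unknown size */
int foo[10];             /* what one source file would define */

/* After the definition, the type of foo is the composite type int[10]. */
static_assert(sizeof foo == 10 * sizeof(int), "composite type is int[10]");

size_t foo_length(void)
{
    return sizeof foo / sizeof foo[0];
}
```

In a file that sees only the extern int foo[]; declaration, sizeof foo would not compile, because the type is still incomplete there.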

Many thanks for your question! I bumped into it while trying to reveal what may stand behind the phrase "Two types have compatible type if their types are the same". You helped me to realize that it only states a sufficient condition for two types to be compatible.
In my understanding, the most important property of the compatibility relation is that it overcomes the scope of a struct/union tag or a typedef name. In particular, C17 directly states that (§6.7.2.3/5):
Two declarations of structure, union, or enumerated types which are in
different scopes or use different tags declare distinct types.
That is, the same struct/union declaration used in two or more files results in different, yet compatible types.

Compatible struct types

The C 11 standard defines struct compatibility as follows (6.2.7):
Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types…
That means I can have 2 files like this:
foo.c:
struct struc {
    int x;
};
int foo(struct struc *s)
{
    return s->x;
}
main.c:
struct struc {
    float x;
};
int foo(struct struc *s);
int main(void)
{
    return foo(&(struct struc){1.2f});
}
Smells like undefined behavior (as it is for types like int and float). But if I am understanding the standard correctly (maybe I am misinterpreting the second sentence), this is allowed. If so, what is the rationale behind this? Why not also specify that structs in separate translation units must also be structurally equivalent?

Smells like undefined behavior
Because it is.
But if I am understanding the standard correctly
This doesn't seem to be the case in this particular instance.
this is allowed.
Nope. I do not see (and you do not explain) how the standard language could be interpreted this way.
The standard says
If both are completed anywhere within their respective translation units
This condition holds in your example.
then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types
This requirement is not satisfied, so the types are not compatible.
Why not also specify that structs in separate translation units must also be structurally equivalent?
The standard specifies exactly that. "[o]ne-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types" is precisely the definition of structural equivalence.
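For comparison, here is a sketch of the well-defined version, in which both files get the structure from one place. The header name struc.h and the wrapper call_foo are assumptions made for this illustration, and everything is folded into one file so that it compiles on its own.

```c
#include <assert.h>

/* struc.h (hypothetical shared header) */
struct struc { int x; };
int foo(struct struc *s);

/* foo.c would contain the definition */
int foo(struct struc *s)
{
    return s->x;
}

/* main.c would call it like this, via a compound literal of the same type */
int call_foo(void)
{
    return foo(&(struct struc){ 7 });
}
```

Because every translation unit sees the same member list, the one-to-one correspondence required by 6.2.7 holds and the call is well defined.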

Variable declaration and definition mismatch

I am using a C89 compiler (embedded systems).
I ran into some C code where one translation unit defines a variable as bool varName;, where bool is a typedef of unsigned char. Another translation unit forward declares the variable as follows: extern char varName;.
This is obviously a type mismatch, and is an error. My question is, what exact rule does this violate? My knee-jerk reaction was that it is an ODR violation, but there is a single definition so I'm not confident that this is an ODR violation.

6.2.7p2
All declarations that refer to the same object or function shall have
compatible type; otherwise, the behavior is undefined.
The C89 standard has the same paragraph.
What it means for declarations to refer to the same object is explained further in the paragraph on linkage:
An identifier declared in different scopes or in the same scope more
than once can be made to refer to the same object or function by a
process called linkage . There are three kinds of linkage: external,
internal, and none.
In the set of translation units and libraries that constitutes an
entire program, each instance of a particular identifier with external
linkage denotes the same object or function. Within one translation
unit, each instance of an identifier with internal linkage denotes the
same object or function. Identifiers with no linkage denote unique
entities.
Compatible types essentially means identical types, with some minor caveats (e.g., extern int foo[]; is compatible with extern int foo[3];).
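A common way to have the compiler catch this kind of mismatch is to put the typedef and the extern declaration in one header and to include that header in the file that defines the variable too. The sketch below assumes a hypothetical header flags.h, uses the name bool_t in place of the question's bool, and adds a made-up helper var_is_set. It sticks to C89 syntax.

```c
#include <assert.h>

/* flags.h (hypothetical): one typedef and one declaration, included everywhere */
typedef unsigned char bool_t;
extern bool_t varName;

/* flags.c: includes flags.h, so the definition is checked against the declaration */
bool_t varName = 1;

/* Hypothetical user of the variable in some other file that includes flags.h */
int var_is_set(void)
{
    return varName != 0;
}
```

If the header said extern char varName; instead, the definition in flags.c would be diagnosed as a conflicting declaration, because both would be visible in the same translation unit.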

Can I define 2 structures of the same type in different files where I want to pass one structure to another?

I have two header files as mentioned below:
file1.h
typedef struct can_type {
    int x;
    float y;
} M_can_type;
file2.h
typedef struct can_type {
    int x;
    float y;
} can_type;
Can I define both structures of the same type in different files given as above where I want to pass one structure to another? Also how to map two different type structures to the same type so that I can pass elements of one structure to other?

Yes, it is legal. Indeed, it is necessary. And it is explicit in the standard, but it is in one of the more turgid and nearly incomprehensible sections of the standard.
ISO/IEC 9899:2011 §6.2.7 Compatible type and composite type
¶1 Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.6 for declarators.55) Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of
corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same
widths. For two enumerations, corresponding members shall have the same values.
55) Two types need not be identical to be compatible.
The section from 'Moreover' onwards discusses the situation you are asking about.
Note that although you have two different typedef names (M_can_type and can_type) in your two headers, the structures that are defined meet the requirements. Remember, typedef names are only aliases for existing other types (so M_can_type is an alias for struct can_type in files that include file1.h and can_type is an alias for struct can_type in files that include file2.h). Because each header defines the structure type, any given source file can only include (directly or indirectly) one of the two headers. If you tried to include both, you'd get the structure type redefined, and that is not allowed (even in C11, where you can have the same typedef name redefined as long as it defines the same type, but you still can't have two definitions of the structure type at the same scope in a single translation unit).
The most common way of ensuring that the types in the separate translation units are compatible is to use a single header to define the type and to include that header in both translation units. However, if you think about it, the compiler doesn't know or care about whether that's what you do. All that matters to it is that the text it sees after preprocessing identifies the same type.
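A minimal sketch of that single-header approach follows, assuming a hypothetical can_type.h that replaces both file1.h and file2.h, plus a made-up helper can_copy. Both typedef names are kept as aliases for the same structure type, so a value can be passed from one to the other with plain assignment.

```c
#include <assert.h>

/* can_type.h (hypothetical single header) */
typedef struct can_type {
    int x;
    float y;
} can_type;
typedef can_type M_can_type;     /* second alias for the very same type */

/* Copies one structure into another; both parameters refer to struct can_type. */
void can_copy(M_can_type *dst, const can_type *src)
{
    *dst = *src;                 /* structure assignment copies every member */
}
```

Since M_can_type and can_type name one type here, no mapping between "different" structures is needed.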

Why does "volatile" demand declare-define consistency only from arrays?

Given the declaration:
extern foo bar;
And, in another file, the definition:
volatile foo bar = ...
I get an error that the definition and declaration are incompatible, which disappears if I add volatile to the declaration or remove it from the definition.
But that's only if foo is an array type, scalar types get along fine with the inconsistency.
I tried it in three different compilers. Does anyone know a reason for this?

Having mismatched qualifiers (const, volatile, restrict) on either a scalar or an array should be undefined behavior.
Declarations that refer to the same object must have compatible types; otherwise the behavior is undefined. We can see this in the draft C99 standard, section 6.2.7 Compatible type and composite type:
All declarations that refer to the same object or function shall have
compatible type; otherwise, the behavior is undefined.
and we can see that a definition is also a declaration from 6.7 Declarations:
A definition of an identifier is a declaration for that identifier
that
and we can see from 6.7.3 Type qualifiers that it means type qualifiers must match:
For two qualified types to be compatible, both shall have the
identically qualified version of a compatible type; the order of type
qualifiers within a list of specifiers or qualifiers does not affect
the specified type.

Strict rules of type compatibility require your declarations to have identical cv-qualifications. I.e. it is not supposed to work even for non-array types. The fact that your compiler allows it to slip through is an implementation-specific quirk of your compiler.
However, one can make an educated guess that the underlying reason for the array-specific behavior is one well-known property of arrays: it is not possible to apply cv-qualifiers to the array itself; any cv-qualifiers applied to array type "fall through" and apply to the individual array elements instead.
For example, this is the reason the following code fails to compile
typedef int A[10];
...
A a;
const A *p = &a;
Note that if A is not an array type, then the code is valid. But if A is an array (as in the above example), the initialization immediately becomes a constraint violation from the standard C point of view. The initialization shall not compile. const A * is const int (*)[10], and in C const int (*)[10] is not compatible with int (*)[10].
In your example, the same compatibility logic (or a variation thereof) is probably used by the compiler when matching declarations to definitions, except that you used volatile instead of const. You can probably reproduce the same result with const.
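The "fall through" of qualifiers can be observed directly with C11's _Generic. This is only a sketch: the object ca and the function qualifier_lands_on_elements are made-up names. It shows that the address of a const A object has the type const int (*)[10], that is, a pointer to an array of const int.

```c
#include <assert.h>

typedef int A[10];

const A ca = { 0 };      /* the const ends up on the int elements */

/* Returns 1 if &ca has type "pointer to array of 10 const int". */
int qualifier_lands_on_elements(void)
{
    return _Generic(&ca, const int (*)[10]: 1, default: 0);
}

/* With matching qualification on both sides the initialization is fine. */
const A *p = &ca;
```

The same reasoning would apply with volatile in place of const, which is the situation in the question.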
