Given the declaration:
extern foo bar;
And, in another file, the definition:
volatile foo bar = ...
I get an error that the definition and declaration are incompatible, which disappears if I add volatile to the declaration or remove it from the definition.
But that's only if foo is an array type; scalar types get along fine with the inconsistency.
I tried it in three different compilers. Does anyone know a reason for this?
Having mismatched qualifiers (const, volatile, restrict) for either a scalar or an array should be undefined behavior.
Declarations that refer to the same object must have compatible types; otherwise we have undefined behavior. We can see this from the draft C99 standard, section 6.2.7 Compatible type and composite type:
All declarations that refer to the same object or function shall have
compatible type; otherwise, the behavior is undefined.
and we can see that a definition is also a declaration from 6.7 Declarations:
A definition of an identifier is a declaration for that identifier that: for an object, causes storage to be reserved for that object [...]
and we can see from 6.7.3 Type qualifiers that it means type qualifiers must match:
For two qualified types to be compatible, both shall have the
identically qualified version of a compatible type; the order of type
qualifiers within a list of specifiers or qualifiers does not affect
the specified type.
Strict rules of type compatibility require your declarations to have identical cv-qualifications. I.e. it is not supposed to work even for non-array types. The fact that your compiler allows it to slip through is an implementation-specific quirk of your compiler.
However, one can make an educated guess that the underlying reason for the array-specific behavior is one well-known property of arrays: it is not possible to apply cv-qualifiers to the array itself; any cv-qualifiers applied to an array type "fall through" and apply to the individual array elements instead.
For example, this is the reason the following code fails to compile
typedef int A[10];
...
A a;
const A *p = &a;
Note that if A is not an array type, then the code is valid. But if A is an array type (as in the above example), the initialization immediately becomes a constraint violation from the standard C point of view, and it should not compile: const A * is const int (*)[10], and in C const int (*)[10] is not compatible with int (*)[10].
In your example, the same compatibility logic (or a variation thereof) is probably used by the compiler when matching declarations to definitions, except that you used volatile instead of const. You can probably reproduce the same result with const.
Related
From what I understand, the main reason people separate function declarations and definitions is so that the functions can be used in multiple compilation units. So then I was wondering, what's the point of violating DRY this way, if structures don't have prototypes and would still cause ODR problems across compilation units? I decided to try and define a structure twice using a header across two compilation units, and then combining them, but the code compiled without any errors.
Here is what I did:
main.c:
#include "test.h"
int main() {
return 0;
}
a.c:
#include "test.h"
test.h:
#ifndef TEST_INCLUDED
#define TEST_INCLUDED
struct test {
int a;
};
#endif
Then I ran the following gcc commands.
gcc -c a.c
gcc -c main.c
gcc -o final a.o main.o
Why does the above work and not give an error?
C's one definition rule (C17 6.9p5) applies to the definition of a function or an object (i.e. a variable). struct test { int a; }; does not define any object; rather, it declares the identifier test as a tag of the corresponding struct type (6.7.2.3 p7). This declaration is local to the current translation unit (i.e. source file) and it is perfectly fine to have it in several translation units. For that matter, you can even declare the same identifier as a tag for different types in different source files, or in different scopes, so that struct test is an entirely different type in one file / function / block than another. It would probably be confusing, but legal.
If you actually defined an object in test.h, e.g. struct test my_test = { 42 };, then you would be violating the one definition rule, and the behavior of your program would be undefined. (But that does not necessarily mean you will get an error message; multiple definitions are handled in various different ways by different implementations.)
The key section in the standard is nearly indigestible, but §6.2.7 Compatible type and composite type covers the details, with some forward references:
¶1 Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.6 for declarators.55) Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values.
¶2 All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
¶3 A composite type can be constructed from two types that are compatible; it is a type that is compatible with both of the two types and satisfies the following conditions:
If both types are array types, the following rules are applied:
If one type is an array of known constant size, the composite type is an array of that size.
Otherwise, if one type is a variable length array whose size is specified by an expression that is not evaluated, the behavior is undefined.
Otherwise, if one type is a variable length array whose size is specified, the composite type is a variable length array of that size.
Otherwise, if one type is a variable length array of unspecified size, the composite type is a variable length array of unspecified size.
Otherwise, both types are arrays of unknown size and the composite type is an array of unknown size.
The element type of the composite type is the composite type of the two element types.
If only one type is a function type with a parameter type list (a function prototype), the composite type is a function prototype with the parameter type list.
If both types are function types with parameter type lists, the type of each parameter in the composite parameter type list is the composite type of the corresponding parameters.
These rules apply recursively to the types from which the two types are derived.
¶4 For an identifier with internal or external linkage declared in a scope in which a prior declaration of that identifier is visible,56) if the prior declaration specifies internal or external linkage, the type of the identifier at the later declaration becomes the composite type.
55) Two types need not be identical to be compatible.
56) As specified in 6.2.1, the later declaration might hide the prior declaration.
Emphasis added
The second part of ¶1 covers explicitly the case of structures, unions and enumerations declared in separate translation units. It is crucial to allowing separate compilation. Note footnote 55 too. However, if you use the same header to define a given structure (union, enumeration) in separate translation units, the chances of you not using a compatible type are small. It can be done if there is conditional compilation and the conditions are different in the two translation units, but you usually have to be trying quite hard to run into problems.
I created 2 functions, to read and write to a path, declared as such:
int Read(const char * /*Filename*/, void * /*Ptr*/, size_t /*Size*/), Write(const char * /*Filename*/, const void * /*Ptr*/, size_t /*Size*/);
I created an additional function to that will call one of the above functions with a path
static int IOData(int(*const Func)(const char *, void *, size_t)) {
char Filename[DATA_PATH_LEN];
// Build path
return Func(Filename, &Data, sizeof(Data));
}
However, when Write is passed as a callback to IOData, the compiler raises the following warning
Incompatible pointer types passing 'int (*)(const char *, const void *, int)' to parameter of type 'int (*)(const char *, void *, int)'
Would casting a function that accepts a const pointer to a function that accepts a non const pointer be undefined behavior?
I noticed that there is an almost identical question but that question uses C++ but this question uses plain C so using templates is not an option
This is not allowed because the type of one of the corresponding parameters is not compatible.
Compatible types are defined in section 6.2.7p1 of the C standard:
Two types have compatible type if their types are the same. Additional
rules for determining whether two types are compatible are described
in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in
6.7.6 for declarators. ...
And section 6.7.3p10 details compatibility of qualified types:
For two qualified types to be compatible, both shall have the
identically qualified version of a compatible type; the order of type
qualifiers within a list of specifiers or qualifiers does not affect
the specified type.
This means that const void * and void * are not compatible.
Compatibility of function types is described in section 6.7.6.3p15:
For two function types to be compatible, both shall specify compatible
return types. Moreover, the parameter type lists, if both are
present, shall agree in the number of parameters and in use of the
ellipsis terminator; corresponding parameters shall have compatible
types. If one type has a parameter type list and the other type is
specified by a function declarator that is not part of a function
definition and that contains an empty identifier list, the parameter
list shall not have an ellipsis terminator and the type of each
parameter shall be compatible with the type that results from the
application of the default argument promotions. If one type has a
parameter type list and the other type is specified by a function
definition that contains a (possibly empty) identifier list, both
shall agree in the number of parameters, and the type of each
prototype parameter shall be compatible with the type that results
from the application of the default argument promotions to the type of
the corresponding identifier. (In the determination of type
compatibility and of a composite type, each parameter declared with
function or array type is taken as having the adjusted type and each
parameter declared with qualified type is taken as having the
unqualified version of its declared type.)
So because one set of corresponding parameters are not compatible, the function types are not compatible.
Finally, section 6.5.2.2p9 regarding the function call operator () describes what happens in this case:
If the function is defined with a type that is not compatible with the
type (of the expression) pointed to by the expression that denotes the
called function, the behavior is undefined.
So calling a function through an incompatible function pointer type triggers undefined behavior and therefore should not be done.
The Standard seeks to classify as Undefined Behavior any action whose behavior might be impractical to define on some plausible implementations. Because classifying an action as UB in no way impairs the ability of a quality implementation to, as a "conforming language extension", process the action in a useful commonplace fashion when one exists, there is no need to avoid characterizing as UB actions which most implementations would process in the same useful fashion.
An implementation that attempts to statically determine maximum stack usage might plausibly assume that calls to a function pointer with a particular signature will only invoke functions whose address is taken and whose signature matches perfectly. If the Standard were to require that pointers to such functions be interchangeable, that might irredeemably break programs which the static analysis tools had previously been able to accommodate.
There's no reason to expect that quality implementations shouldn't be configurable to treat such function pointers as interchangeable in cases where doing so would be useful and practical, but the Standard waives jurisdiction over quality-of-implementation issues of usefulness and practicality. Unfortunately, it can be hard to know which implementations should be relied upon to support such constructs, because many implementations which would have no reason not to support them don't regard that support as sufficiently noteworthy to justify explicit documentation.
First question
I found on cppreference
_Atomic ( type-name ) (1) (since C11)
Use as a type specifier; this designates a new atomic type
_Atomic type-name (2) (since C11)
Use as a type qualifier; this designates the atomic version of type-name. In this role, it may be mixed with const, volatile, and restrict, although unlike other qualifiers, the atomic version of type-name may have a different size, alignment, and object representation.
So does using _Atomic(int) instead of _Atomic int
guarantee it to be the same size as int or not?
Second question
Using a qualifier inside _Atomic
Ex:
_Atomic(volatile int)
Throws an error, but using it like this:
_Atomic(volatile _Atomic(int)*)
Does not; is this standard behaviour?
Last question
I noticed atomic functions (ex: atomic_store, atomic_load, atomic_compare_exchange_weak) work without the passed types being _Atomic types, and I can still manage race conditions with no problem.
Is this standard behaviour? Does it have downsides or lead to any error?
First question:
C11 7.17.6p3:
NOTE The representation of atomic integer types need not have the same size as their corresponding regular types. They should have the same size whenever possible, as it eases effort required to port existing code.
Second question:
C11 6.7.2.4p3:
[Constraints]
3 The type name in an atomic type specifier shall not refer to an array type, a function type, an atomic type, or a qualified type.
volatile int is a qualified type. A shall in a constraints section is violated, therefore the compiler needs to output a diagnostic message. Beyond that, the behaviour of such a construct is undefined.
Third question:
C11 7.17.1.p5:
5 In the following synopses:
An A refers to one of the atomic types.
They expect an _Atomic type. Passing a non-atomic object therefore results in undefined behaviour.
Why does gcc allow extern declarations of type void? Is this an extension or
standard C? Are there acceptable uses for this?
I am guessing it is an extension, but I don't find it mentioned at:
http://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/C-Extensions.html
$ cat extern_void.c
extern void foo; /* ok in gcc 4.3, not ok in Visual Studio 2008 */
void* get_foo_ptr(void) { return &foo; }
$ gcc -c extern_void.c # no compile error
$ gcc --version | head -n 1
gcc (Debian 4.3.2-1.1) 4.3.2
Defining foo as type void is of course a compile error:
$ gcc -c -Dextern= extern_void.c
extern_void.c:1: error: storage size of ‘foo’ isn’t known
For comparison, Visual Studio 2008 gives an error on the extern declaration:
$ cl /c extern_void.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 15.00.21022.08 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
extern_void.c
extern_void.c(1) : error C2182: 'foo' : illegal use of type 'void'
Strangely enough (or perhaps not so strangely...) it looks to me like gcc is correct to accept this.
If this was declared static instead of extern, then it would have internal linkage, and §6.9.2/3 would apply:
If the declaration of an identifier for an object is a tentative definition and has internal
linkage, the declared type shall not be an incomplete type.
If it didn't specify any storage class (extern, in this case), then §6.7/7 would apply:
If an identifier for an object is declared with no linkage, the type for the object shall be complete by the end of its declarator, or by the end of its init-declarator if it has an
initializer; in the case of function arguments (including in prototypes), it is the adjusted type (see 6.7.5.3) that is required to be complete.
In either of these cases, void would not work, because (§6.2.5/19):
The void type [...] is an incomplete type that cannot be completed.
None of those applies, however. That seems to leave only the requirements of §6.7.2/2, which seems to allow a declaration of a name with type void:
At least one type specifier shall be given in the declaration specifiers in each declaration,
and in the specifier-qualifier list in each struct declaration and type name. Each list of
type specifiers shall be one of the following sets (delimited by commas, when there is
more than one set on a line); the type specifiers may occur in any order, possibly
intermixed with the other declaration specifiers.
void
char
signed char
[ ... more types elided]
I'm not sure that's really intentional -- I suspect the void is really intended for things like derived types (e.g., pointer to void) or the return type from a function, but I can't find anything that directly specifies that restriction.
I've found the only legitimate use for declaring
extern void foo;
is when foo is a link symbol (an external symbol defined by the linker) that denotes the address of an object of unspecified type.
This is actually useful because link symbols are often used to communicate the extent of memory; i.e. .text section start address, .text section length, etc.
As such, it is important for the code using these symbols to document their type by casting them to an appropriate value. For instance, if foo is actually the length of a memory region:
uint32_t textLen;
textLen = ( uint32_t )&foo;
Or, if foo is the start address of that same memory region:
uint8_t *textStart;
textStart = ( uint8_t * )&foo;
The only alternate way to reference a link symbol in "C" that I know of is to declare it as an external array:
extern uint8_t foo[];
I actually prefer the void declaration, as it makes it clear that the linker defined symbol has no intrinsic "type."
GCC (and the LLVM C frontend) is definitely buggy here. Both Comeau and MS seem to report errors, though.
The OP's snippet has at least two definite UBs and one red herring:
From N1570
[UB #1] Missing main in hosted environment:
J.2 Undefined Behavior
[...] A program in a hosted environment does not define a function named main using one of the specified forms (5.1.2.2.1).
[UB #2] Even if we ignore the above there still remains the issue of taking the address of a void expression which is explicitly forbidden:
6.3.2.1 Lvalues, arrays, and function designators
1 An lvalue is an expression (with an object type other than void) that potentially designates an object;
and:
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function
designator, the result of a [] or unary * operator, or an lvalue that
designates an object that is not a bit-field and is not declared with
the register storage-class specifier.
[Note: emphasis on lvalue mine]
Also, there is a section in the standard specifically on void:
6.3.2.2 void
1 The (nonexistent) value of a void expression (an expression that has type void) shall not be used in any way, and
implicit or explicit conversions (except to void) shall not be applied
to such an expression.
An identifier such as foo at file scope is a primary expression (6.5), and so is taking the address of the object it denotes; as quoted above, the latter invokes UB here. This construct is thus explicitly ruled out.
What remains to be figured out is if removing the extern qualifier makes the above valid or not:
In our case, for foo, §6.2.2/5 applies:
5 [...] If the declaration of an identifier for an object
has file scope and no storage-class specifier, its linkage is
external.
i.e. even if we left out the extern we'd still run into the same problem.
One limitation of C's linker-interaction semantics is that it provides no mechanism for allowing numeric link-time constants. In some projects, it may be necessary for static initializers to include numeric values which are not available at compile time but will be available at link time. On some platforms, this may be accomplished by defining somewhere (e.g. in an assembly-language file) a label whose address, if cast to int, would yield the numeric value of interest. An extern declaration can then be used within the C file to make the "address" of that thing available as a compile-time constant.
This approach is very much platform-specific (as would be anything using assembly language), but it makes possible some constructs that would be problematic otherwise. A somewhat nasty aspect of it is that if the label is declared in C with a type like unsigned char[], that will convey the impression that the address may be dereferenced or have arithmetic performed upon it. If a compiler will accept void foo;, then (int)&foo will convert the linker-assigned address for foo to an integer using the same pointer-to-integer semantics as would be applicable with any other void *.
I don't think I've ever used void for that purpose (I've always used extern unsigned char[]) but would think void would be cleaner if something defined it as being a legitimate extension (nothing in the C standard requires that any ability exist anywhere to create a linker symbol which can be used as anything other than one specific non-void type; on platforms where no means would exist to create a linker identifier which a C program could define as extern void, there would be no need for compilers to allow such syntax).
Is accessing a non-const object through a const declaration allowed by the C standard?
E.g. is the following code guaranteed to compile and output 23 and 42 on a standard-conforming platform?
translation unit A:
int a = 23;
void foo(void) { a = 42; }
translation unit B:
#include <stdio.h>
extern volatile const int a;
void foo(void);
int main(void) {
printf("%i\n", a);
foo();
printf("%i\n", a);
return 0;
}
In the ISO/IEC 9899:1999, I just found (6.7.3, paragraph 5):
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined.
But in the case above, the object is not defined as const (but just declared).
UPDATE
I finally found it in ISO/IEC 9899:1999.
6.2.7, 2
All declarations that refer to the same object or function shall have compatible type;
otherwise, the behavior is undefined.
6.7.3, 9
For two qualified types to be compatible, both shall have the identically qualified
version of a compatible type; [...]
So, it is undefined behaviour.
TU A contains the (only) definition of a. So a really is a non-const object, and it can be accessed as such from a function in A with no problems.
I'm pretty sure that TU B invokes undefined behavior, since its declaration of a doesn't agree with the definition. Best quote I've found so far to support that this is UB is 6.7.5/2:
Each declarator declares one identifier, and asserts that when an
operand of the same form as the declarator appears in an expression,
it designates a function or object with the scope, storage duration,
and type indicated by the declaration specifiers.
[Edit: the questioner has since found the proper reference in the standard, see the question.]
Here, the declaration in B asserts that a has type volatile const int. In fact the object does not have (qualified) type volatile const int, it has (qualified) type int. Violation of semantics is UB.
In practice what will happen is that TU A will be compiled as if a is non-const. TU B will be compiled as if a were a volatile const int, which means it won't cache the value of a at all. Thus, I'd expect it to work provided the linker doesn't notice and object to the mismatched types, because I don't immediately see how TU B could possibly emit code that goes wrong. However, my lack of imagination is not the same as guaranteed behavior.
AFAIK, there's nothing in the standard to say that volatile objects at file scope can't be stored in a completely different memory bank from other objects, that provides different instructions to read them. The implementation would still have to be capable of reading a normal object through, say, a volatile pointer, so suppose for example that the "normal" load instruction works on "special" objects, and it uses that when reading through a pointer to a volatile-qualified type. But if (as an optimization) the implementation emitted the special instruction for special objects, and the special instruction didn't work on normal objects, then boom. And I think that's the programmer's fault, although I confess I only invented this implementation 2 minutes ago so I can't be entirely confident that it conforms.
In the B translation unit, const would only prohibit modifying the a variable within the B translation unit itself.
Modifications of that value from outside (other translation units) will reflect on the value you see in B.
This is more of a linker issue than a language issue. The linker is free to frown upon the differing qualifications of the a symbol (if there is such information in the object files) when merging the compiled translation units.
Note, however, that if it's the other way around (const int a = 23 in A and extern int a in B), you would likely encounter a memory access violation in case of attempting to modify a from B, since a could be placed in a read-only area of the process, usually mapped directly from the .rodata section of the executable.
The declaration that has the initializer is the definition, so your object is indeed not a const-qualified object and foo has every right to modify it.
In B you are providing access to that object with an additional const qualification. Since the two types (the const-qualified version and the unqualified version) have the same object representation, read access through that identifier is valid.
Your second printf, though, has a potential problem. Had you not qualified your B version of a as volatile, you would not be guaranteed to see the modification of a: the compiler is allowed to optimize and reuse the previous value that it might have kept in a register.
Declaring it as const means that the instance is defined as const; you cannot access it through a non-const lvalue. Most compilers will not allow it, and the standard says it's not allowed either.
FWIW: In H&S5 is written (Section 4.4.3 Type Qualifiers, page 89):
"When used in a context that requires a value rather than a designator, the qualifiers are eliminated from the type." So the const only has an effect when someone tries to write something into the variable.
In this case, the printf calls use a as an rvalue, and the added volatile (unnecessary IMHO) makes the program read the variable anew, so I would say the program is required to produce the output the OP saw initially, on all platforms/compilers.
I'll look at the Standard, and add it if/when I find anything new.
EDIT: I couldn't find any definite solution to this question in the Standard (I used the latest draft for C1X), since all references to linker behavior concentrate on names being identical. Type qualifiers on external declarations do not seem to be covered.
Maybe we should forward this question to the C Standard Committee.