Can extension cancel the existing standard requirements?


Follow-up question for Why do conforming implementations behave differently w.r.t. incomplete array types with internal linkage?.
Context: in both gcc and clang (conforming implementations) by default the requirement C11,6.9.2p3 [1] is cancelled, which is positioned as an extension.
Question: can an extension cancel the existing standard requirements while keeping the implementation conforming?
[1] C11, 6.9.2 External object definitions, 3:
If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type.
UPD. Yes. In other words, the standard says: "we do not support this, a diagnostic is required". The extension says: "we do support this (hence, the diagnostic required by the standard is irrelevant)".

It's not so much that an implementation "cancels" a requirement with an extension, but that extensions add features that the standard doesn't otherwise support. The main requirement is that extensions don't alter the behavior of any strictly conforming program.
The definition of a conforming implementation is as follows from section 4p6 of the C11 standard:
The two forms of conforming implementation are hosted and
freestanding. A conforming hosted implementation shall accept
any strictly conforming program. A conforming freestanding implementation shall accept any strictly conforming program in which the use of the features specified in the library clause (clause 7) is confined to the contents of the standard headers [ ... omitted for brevity ... ]. A conforming
implementation may have extensions (including additional library
functions), provided they do not alter the behavior of any
strictly conforming program
Where a strictly conforming program is defined in section 4p5:
A strictly conforming program shall use only those features
of the language and library specified in this International
Standard. It shall not produce output dependent on any
unspecified, undefined, or implementation-defined behavior, and
shall not exceed any minimum implementation limit
And a conforming program is defined in section 4p7:
A conforming program is one that is acceptable to a conforming implementation.
So given the case of the program from your prior question:
static int arr[ ];
int main( void )
{
return arr[ 0 ];
}
static int arr[ ] = { 0 };
This is not a strictly conforming program because it violates 6.9.2p3. However, some implementations such as gcc allow this as an extension. Supporting such a feature doesn't prevent a similar strictly conforming program such as this
static int arr[1];
int main( void )
{
return arr[ 0 ];
}
static int arr[ ] = { 0 };
from behaving any differently. Therefore an implementation supporting this feature still qualifies as a conforming implementation. This also means that the first program, while not a strictly conforming program, is a conforming program because it will run in a well defined manner on a conforming implementation.

The Standard requires that if a program violates a constraint which is in a constraints section, an implementation must issue at least one diagnostic. There is no requirement that the diagnostic be meaningful or have any relation to the constraint violation. An implementation that unconditionally output "Warning: this implementation makes no attempt to enforce constraints its author views as stupid" would suffice. So would an implementation that includes a command-line option to output such a message and documents that it may not be conforming unless that option is specified.
Note that even that requirement has a loophole: if a program exceeds an implementation's translation limits, the implementation may behave in any manner whatsoever, without limitation, and without having to issue any sort of diagnostic. Although the Standard requires that for each implementation there exist at least one source program that at least nominally exercises the translation limits given in the Standard without causing the implementation to malfunction, an implementation may impose arbitrary restrictions on how the translation limits interact, e.g. allowing a program to either contain one identifier of up to 63 characters, or a larger number of identifiers that are no more than three characters long. There are very few circumstances where anything an otherwise-conforming implementation might do with a particular source text would render it non-conforming.

Related

Have the code examples from K&R ever been conforming?

The C Programming Language by Brian Kernighan and Dennis Ritchie contains a lot of examples such as this famous one (K&R 2nd edition 1.1):
#include <stdio.h>
main()
{
printf("hello, world\n");
}
Here I note the following issues:
No return type.
Writing functions with no return type was allowed in C90 which the second edition of the book claims to follow. These will default to int in C90. It is invalid C in later versions of the language.
No return statement.
A function with a return type and no return statement was not well-defined in C90. Writing main() with no return being equivalent to return 0; was a feature added in C99.
Empty parameter list of main().
This is valid C still (as of C17) but has always been an obsolescent feature even in C90. (Upcoming C23 talks of finally getting rid of K&R style functions.)
My question:
Was any code in K&R 2nd edition ever a conforming program, in any version of the standard?

By definition, any source text or collection thereof which is "accepted" by a Conforming C Implementation is a "Conforming C Program". Because implementations are given broad latitude to extend the language in any way which does not affect the behavior of any Strictly Conforming C Programs, any source text T which would not otherwise be a Conforming C Program could be turned into a Conforming C Program by modifying a Conforming C Implementation so that if it were given a program that doesn't match T, it would process it normally, and if fed a copy of T it would behave as though it were fed some other program that it would accept.
While this may seem an absurdly broad definition, it satisfies one of the major goals of the C Standards Committee, which was to ensure that if any existing programs could accomplish a task, the task could be accomplished by a Conforming C Program.
As for whether the programs were Strictly Conforming under C89, that's a bit harder to answer. The Standard says that if execution falls through the end of main() it will return an Undefined Value to the host environment, and imposes no requirements about the consequence of doing so, which would suggest that such an action would invoke Undefined Behavior. On the other hand, the Standard also imposes no requirements upon what happens if a program returns EXIT_SUCCESS, nor what happens if it returns EXIT_FAILURE, nor if it returns some other value. Thus, all such actions could be viewed as invoking Undefined Behavior. On the other hand, viewing things in such fashion would make it impossible for any program which terminates to be Strictly Conforming.
I think the most reasonable way of interpreting the Standard would be to say that a program whose execution falls through the end of main() waives any control it might have had to affect what the execution environment does once it terminates. If all courses of action the host environment could perform after a program exits would be equally acceptable, a program's failure to do anything to influence which course of action is taken would not be a defect.
In considering whether a program that fails to specify a return value, or any program for that matter, is "Strictly Conforming", one cannot merely examine the source text, but must also consider the application requirements. If one needs a program to output the characters x and y once each, in some order, the following would be a strictly conforming program that accomplishes that:
#include <stdio.h>
int outputx(void) { return printf("x"); }
int outputy(void) { return printf("y"); }
int main(void)
{
return outputx() + outputy() && printf("\n") && 0;
}
If, on the other hand, one needs a program to output "xy", the above would not be a strictly conforming program for that purpose. Thus, I would say that if the application requirements for some task specify that a program must use its return value to influence the host environment, a program that falls through the end of main would not be a Strictly Conforming C Program to accomplish that task. If, however, such influence over the host environment is not part of the application requirements for a task, then a Strictly Conforming C Program could waive such control.
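For comparison, here is a minimal sketch of how the "xy" requirement could be met without depending on the unspecified evaluation order of the operands of +. The helper names and the FILE * parameter are assumptions made for illustration; they are not part of the program above.

```c
#include <stdio.h>

/* Hypothetical helpers, mirroring outputx()/outputy() above but writing
   to a caller-supplied stream so the result is easy to check. */
static int output_x(FILE *out) { return fputc('x', out); }
static int output_y(FILE *out) { return fputc('y', out); }

/* Writes "xy\n" in that order; returns 0 on success, 1 on failure. */
int output_xy(FILE *out)
{
    /* Each if-condition is a full expression, so the call to output_x()
       has completed before output_y() is called. */
    if (output_x(out) == EOF) return 1;
    if (output_y(out) == EOF) return 1;
    if (fputc('\n', out) == EOF) return 1;
    return 0;
}
```

Calling output_xy(stdout) from main would then emit the two characters in a fixed order on any conforming implementation.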
Citations below:
From N1570 section 4 paragraph 7:
A conforming program is one that is acceptable to a conforming implementation. 5)
5) Strictly conforming programs are intended to be maximally portable among conforming implementations. Conforming programs may depend upon nonportable features of a conforming implementation.
Undefined Behavior is defined in 3.4.3:
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
From the C99 Rationale, talking about the definition of conformance [emphasis original]:
A strictly conforming program is another term for a maximally portable program. The goal is to give the programmer a fighting chance to make powerful C programs that are also highly portable, without seeming to demean perfectly useful C programs that happen not to be portable, thus the adverb strictly.
The fact that a program exits without setting a return value may make it non-portable, but the Standard deliberately avoids "demeaning" non-portable programs: it does not call them non-conforming.

No, the programs in the K&R book were never conforming programs 1) (C17 4/7) under any version of the standard.
In C90 (ISO 9899:1990), the code invoked undefined behavior because of the missing return statement.
In C99 (ISO 9899:1999) and beyond, the code won't compile because of the implicit int.
Sources below.
Regarding implicit int, one major difference in function return types between C90 and later versions can be found here:
C90 6.7.1 Function definitions
The return type of a function shall be void or an object type other than array.
/--/
If the declarator includes an identifier list, the types of the parameters may be declared in a following declaration list. Any parameter that is not declared has type int.
C17 6.9.1 Function definitions
The return type of a function shall be void or a complete object type other than array type.
/--/
If the declarator includes an identifier list, the types of the parameters shall be declared in a following declaration list. In either case, the type of each parameter is adjusted as described in 6.7.6.3 for a parameter type list; the resulting type shall be a complete object type.
The main difference being the "complete object type" wording, the definition of complete object type being one of the basic types or a pointer to one (C17 6.2.5). We can conclude that implicit int was allowed in C90 both as the return type or as part of a (non-prototype) parameter list.
Regarding no return statement, this text was always there for general functions:
C90 6.6.6.4
If a return statement without an expression is executed and the value of the function call is used by the caller, the behavior is undefined. Reaching the } that terminates a function is equivalent to executing a return statement without an expression.
C17 6.9.1/12
If the } that terminates a function is reached, and the value of the function call is used by
the caller, the behavior is undefined.
However, main() is a special case and an exception was added in C99:
C99 5.1.2.2.3
If the return type of the main function is a type compatible with int, a return from the
initial call to the main function is equivalent to calling the exit function with the value
returned by the main function as its argument; reaching the } that terminates the
main function returns a value of 0.
Whereas in C90, the equivalent text says:
C90 5.1.2.2.3
A return from the initial call to the main function is equivalent to calling the exit function
with the value returned by the main function as its argument. If the main function executes a
return that specifies no value, the termination status returned to the host environment is undefined.
Regarding empty parameter lists, it has been marked as obsolescent from C90 to C17. See future language directions, for example C17 6.11 (or C90 6.9, identical text):
6.11.6 Function declarators
The use of function declarators with empty parentheses (not prototype-format parameter type
declarators) is an obsolescent feature.
6.11.7 Function definitions
The use of function definitions with separate parameter identifier and declaration lists (not prototype-format parameter type and identifier declarators) is an obsolescent feature.
This does however not mean that code using the feature isn't conforming, at least up to ISO 9899:2018. It's simply not recommended practice, and was not recommended practice at the point where K&R 2nd edition was released either.
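As a small illustration of the prototype-format style that 6.11.6 and 6.11.7 contrast with the obsolescent forms, here is a sketch. The function name and body are chosen purely for illustration and are not quoted from the book.

```c
/* Prototype-format declaration: the parameter types appear inside the
   parentheses, so callers are checked against them. */
int power(int base, int n);

/* Prototype-format definition. The obsolescent identifier-list form of
   the same definition would read:  int power(base, n) int base, n; { ... }
   Computes base raised to n for n >= 0; overflow is not handled here. */
int power(int base, int n)
{
    int result = 1;

    while (n-- > 0)
        result *= base;
    return result;
}
```

The explicit int return type and the return statement also avoid the implicit-int and missing-return issues discussed above.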
1) C17 from chp 4:
A conforming implementation may have extensions (including
additional library functions), provided they do not alter the behavior of any strictly
conforming program.
A conforming program is one that is acceptable to a conforming implementation.
A strictly conforming program shall use only those features of the language and library
specified in this International Standard. It shall not produce output dependent on any
unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.
This means that a conforming program may use features of a conforming implementation that are non-portable, but it may not alter the behavior of a strictly conforming program by for example invoking undefined behavior explicitly listed as such in the standard.

I compiled the program
#include <stdio.h>
main()
{
printf("hello, world\n");
}
under two well-regarded compilers, gcc and clang. Just for fun I added the --pedantic option also. As far as I know both of these compilers would be considered "conforming", and I believe that's one of the things their authors certainly strive for.
Both compilers produced an executable which printed hello, world. Under the definition that
A conforming program is one that is acceptable to a conforming implementation
, I conclude that the program is conforming.
I pass no judgement on the question of whether the program would have been conforming under C89.
Although I have not studied the code examples in K&R2 in some years, I believe that most/all of the rest of them are similarly conforming, despite various pedagogical or other shortcuts which might render them not strictly conforming.

Some quotes:
"The first edition, published February 22, 1978, was the first widely available book on the C programming language. Its version of C is sometimes termed K&R C (after the book's authors), often to distinguish this early version from the later version of C standardized as ANSI C."
(source Wikipedia)
In other words, K&R edition 1 predates any official C standards. At the time the only specification was "The C Reference Manual" by Dennis M. Ritchie.
"In April 1988, the second edition of the book was published, updated to cover the changes to the language resulting from the then-new ANSI C standard, particularly with the inclusion of reference material on standard libraries."
(source Wikipedia)
In other words, K&R edition 2 was "aligned with" the first official ANSI C standard otherwise known as C89.
However, at the time K&R edition 2 was published, C89 was not yet complete. According to the Wikipedia page on ANSI C:
"In 1983, the American National Standards Institute formed a committee, X3J11, to establish a standard specification of C. In 1985, the first Standard Draft was released, sometimes referred to as C85. In 1986, another Draft Standard was released, sometimes referred to as C86. The prerelease Standard C was published in 1988, and sometimes referred to as C88."
(source Wikipedia)
Thus, there may be differences between what K&R says and the ANSI C standard.

What is the rationale for "semantics violation does not require diagnostics"?

Follow-up question for: If "shall / shall not" requirement is violated, then does it matter in which section (e.g. Semantics, Constraints) such requirement is located?.
ISO/IEC 9899:202x (E) working draft— December 11, 2020 N2596, 5.1.1.3 Diagnostics, 1:
A conforming implementation shall produce at least one diagnostic message (identified in an
implementation-defined manner) if a preprocessing translation unit or translation unit contains a
violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined. Diagnostic messages need not be produced in other circumstances.
Consequence: semantics violation does not require diagnostics.
Question: what is the (possible) rationale for "semantics violation does not require diagnostics"?

A possible rationale is given by Rice's theorem: non-trivial semantic properties of programs are undecidable.
For example, division by zero is a semantics violation; and you cannot decide, by static analysis alone of the C source code, that it won't happen...
A standard cannot require total detection of such undefined behavior, even if of course some tools (e.g. Frama-C) are sometimes capable of detecting them.
See also the halting problem. You should not expect a C compiler to solve it!
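Since the divisor is often known only at run time, the check has to live in the program rather than in the compiler. A minimal sketch, assuming a hypothetical helper named checked_div (not part of any standard library):

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores num / den in *quotient and returns true,
   or returns false (leaving *quotient untouched) when the division
   would have undefined behavior. */
bool checked_div(int num, int den, int *quotient)
{
    if (den == 0)
        return false;               /* division by zero */
    if (num == INT_MIN && den == -1)
        return false;               /* result not representable in int */
    *quotient = num / den;
    return true;
}
```

No diagnostic is required for an unchecked num / den, because whether den is zero generally depends on data the translator never sees.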

The C99 rationale v5.10 gives this explanation:
5.1.1.3 Diagnostics
By mandating some form of diagnostic message for any program containing a syntax error or
constraint violation, the Standard performs two important services. First, it gives teeth to the
concept of erroneous program, since a conforming implementation must distinguish such a program from a valid one. Second, it severely constrains the nature of extensions permissible to
a conforming implementation.
The Standard says nothing about the nature of the diagnostic message, which could simply be
“syntax error”, with no hint of where the error occurs. (An implementation must, of course,
describe what translator output constitutes a diagnostic message, so that the user can recognize it as such.) The C89 Committee ultimately decided that any diagnostic activity beyond this level is
an issue of quality of implementation, and that market forces would encourage more useful
diagnostics. Nevertheless, the C89 Committee felt that at least some significant class of errors
must be diagnosed, and the class specified should be recognizable by all translators.

This happens because the grammar of the C language is context-sensitive, and for all languages defined with context-free or more complex grammars on the Chomsky hierarchy one must make a trade-off between the semantics of the language and its power.
C designers chose to allow much power for the language and this is why the problem of undecidability is omnipresent in C.
There are languages like Coq that try to cut out the undecidable situations and they restrict the semantics of the recursive functions (they allow only sigma(primitive) recursivity).

The question of whether an implementation provides any useful diagnostics in any particular situation is a Quality of Implementation issue outside the Standard's jurisdiction. If an implementation were to unconditionally output "Warning: this program does not output any useful diagnostics" or even "Warning: water is wet", such output would fully satisfy all of the Standard's requirements with regard to diagnostics even if the implementation didn't output any other diagnostics.
Further, the authors of the Standard characterized as "Undefined Behavior" many actions which they expected would be processed in a meaningful and useful fashion by many if not most implementations. According to the published Rationale document, Undefined Behavior among other things "identifies areas of conforming language extension", since implementations are allowed to specify how they will behave in cases that are not defined by the Standard.
Having implementations issue warnings about constructs which were non-portable, but which they would process in a useful fashion would have been annoying.
Prior to the Standard, some implementations would usefully accept constructs like:
struct foo {
int *p;
char pad [4-sizeof (int*)];
int q,r;
};
for all sizes of pointer up to four bytes (8-byte pointers weren't a thing back then), rather than squawking if pointers were exactly four bytes, but some people on the Committee were opposed to the idea of accepting declarations for zero-sized arrays. Thus, a compromise was reached where compilers would squawk about such things, programmers would ignore the useless warnings, and the useful constructs would remain usable on implementations that supported them.
While there was a vague attempt to distinguish between constructs that should produce warnings that programmers could ignore, versus constructs that might be used so much that warnings would be annoying, the fact that issuance of useful diagnostics was a Quality of Implementation issue outside the Standard's jurisdiction meant there was no real need to worry too much about such distinctions.
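For readers who want the same "first member occupies at least four bytes" effect without a possibly zero-sized array, one option is a union. This is only a sketch of an alternative; the member name head and the helper foo_q_offset are assumptions for illustration, and the layout is not guaranteed to match what the pre-Standard compilers produced.

```c
#include <stddef.h>

/* The union is as large as its largest member, so the pointer is padded
   out to at least 4 bytes and no array bound can ever be zero. */
struct foo {
    union {
        int *p;
        char pad[4];
    } head;
    int q, r;
};

/* Hypothetical helper: byte offset of q, i.e. the space taken by head
   plus any padding the implementation inserts. */
size_t foo_q_offset(void)
{
    return offsetof(struct foo, q);
}
```

Unlike 4-sizeof (int*), this form also stays valid when pointers are wider than four bytes.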

Does "strictly conforming program" + no extensions mean "no diagnostics emitted"?

Follow-up question for: clang: <string literal> + <expression returning int> leads to confusing warning: adding 'int' to a string does not append to the string.
Does "strictly conforming program" + no extensions mean "no diagnostics emitted"?
Reason: better understanding of the term "strictly conforming program".

An implementation may generate diagnostics even if a program is conforming.
Section 5.1.1.3p1 of the C standard regarding diagnostics states:
A conforming implementation shall produce at least one diagnostic
message (identified in an implementation-defined manner) if a
preprocessing translation unit or translation unit contains a
violation of any syntax rule or constraint, even if the behavior is
also explicitly specified as undefined or implementation-defined.
Diagnostic messages need not be produced in other
circumstances. 9)
9) The intent is that an implementation should identify the
nature of, and where possible localize, each violation. Of
course, an implementation is free to produce any number of
diagnostics as long as a valid program is still correctly
translated. It may also successfully translate an invalid program.
The portion in bold in footnote 9 states that additional diagnostics may be produced.
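The construct from the linked question makes a convenient illustration. The sketch below uses hypothetical function names; both functions are valid C with the same result, yet a compiler such as clang may still report the first form with the warning quoted in the question title.

```c
/* Pointer arithmetic on a string literal: well defined for offsets
   0 through 6 (the literal "hello" has 6 chars including the '\0', and
   one past the end is allowed), but a compiler may warn that this does
   not append to the string. */
const char *greeting_from(int offset)
{
    return "hello" + offset;
}

/* Equivalent form using array indexing, which is a common way to
   express the same intent explicitly. */
const char *greeting_from_indexed(int offset)
{
    return &"hello"[offset];
}
```

For example, greeting_from(2) and greeting_from_indexed(2) both point at the characters "llo".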

Does "strictly conforming program" + no extensions == no diagnostics emitted?
No.
The only things for which the language specification requires diagnostics to be emitted are invalid syntax and constraint violations:
A conforming implementation shall produce at least one diagnostic
message (identified in an implementation-defined manner) if a
preprocessing translation unit or translation unit contains a
violation of any syntax rule or constraint, even if the behavior is
also explicitly specified as undefined or implementation-defined.
Diagnostic messages need not be produced in other circumstances.
(C2017, 5.1.1.3/1; emphasis added)
By definition, a strictly conforming program exhibits only valid syntax and does not contain any constraint violations, therefore the specification does not require a conforming implementation to emit any diagnostics when presented with such a program.
HOWEVER, the specification does not forbid implementations to emit diagnostics other than those that are required, and most implementations do, under some circumstances, emit diagnostics that are not required. The specification allows this, as clarified by footnote 9, which says, in part:
Of course, an
implementation is free to produce any number of diagnostics as long as
a valid program is still correctly translated.
Note also that "'strictly conforming program' + no extensions" is redundant. A program that makes use of any language extensions may conform, but it does not strictly conform:
A strictly conforming program shall use only those features of the
language and library specified in this International Standard. It
shall not produce output dependent on any unspecified, undefined, or
implementation-defined behavior, and shall not exceed any minimum
implementation limit.
(C2017 4/5; emphasis added)
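To make the "implementation-defined behavior" part concrete: right-shifting a negative int gives an implementation-defined result (C17 6.5.7), so output computed with v >> 1 for negative v would not be strictly conforming. Below is a sketch of a portable alternative, assuming C99 or later, where integer division truncates toward zero; the function name is hypothetical.

```c
/* Hypothetical helper: halves v, rounding toward negative infinity,
   using only behavior that is fully specified in C99 and later. */
int halve_floor(int v)
{
    int q = v / 2;      /* truncates toward zero */

    if (v % 2 < 0)      /* odd negative value: truncation rounded up */
        q -= 1;
    return q;
}
```

With this, halve_floor(-7) is -4 on every conforming C99-or-later implementation, whereas the value of -7 >> 1 is left to the implementation.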

Is providing the ability to violate "shall" requirement w/o generation of a diagnostic message a compiler bug / defect or feature?

Context: The C standard does not classify diagnostic messages as "warnings" or "errors".
Question: By treating certain "diagnostic messages" as "warnings" and by giving the ability to disable generation of warnings, certain compiler implementations allow the end user to violate "shall" requirements of the C standard w/o generation of a diagnostic message. Is this allowance a compiler bug / defect? If not, then how should this case be interpreted? As a "compiler feature that allows violating a "shall" requirement w/o generation of a diagnostic message"?
Example:
#pragma warning( disable : 34 )
typedef int T[];
int main()
{
return sizeof(T);
}
$ cl t28.c /Za
<no diagnostic messages, the "shall" requirement [1] is silently violated>
[1] ISO/IEC 9899:1990:
The sizeof operator shall not be applied to an expression that has function type or an incomplete type.
UPD.
If /Za (Disable Language Extensions) is specified, then __STDC__ is defined with definition 1.
According to ANSI Conformance page (https://learn.microsoft.com/en-us/cpp/c-language/ansi-conformance?view=msvc-160):
Microsoft C conforms to the standard for the C language as set forth in the 9899:1990 edition of the ANSI C standard.
However, cl gives the end user the ability to disable "shall requirement originated" warnings. Is it a compiler bug / defect or feature? I need to interpret this case correctly.

C 2018 6.10.6 discusses the #pragma directive. Paragraph 1 says:
… causes the implementation to behave in an implementation-defined manner. The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner…
That largely licenses the implementation to do anything it wants, as long as it documents it. If #pragma warning( disable : 34 ) is documented to disable the warning, and that is what it does, then that is conforming.
Note in particular that the #pragma “might … cause the translator … to behave in a non-conforming manner.” So, doing something that is otherwise non-conforming because a pragma told you to is conforming.
(I think the original text should say that the #pragma may cause the translator or program to behave in an otherwise non-conforming manner. Because, as currently written, behaving in this documented non-conforming manner is conforming, not non-conforming.)

"shall" (and "shall not") requirements in the standard come in two distinct kinds: restrictions on the program and restrictions on the implementation.
Restrictions on the implementation are things the implementation must (or must not) do -- these may have mandatory diagnostics associated with them.
Restrictions on the program are in fact freedoms for the implementation -- they are things that -- if the program does them -- cause undefined behavior, so the implementation can do anything with them and still be conforming.
The example you have above, "The sizeof operator shall not be applied to an expression that ...", is a restriction on the program. So a program that does that is not conforming and an implementation can do anything it wants (including treating it as an extension without any requirement for a flag or pragma) and still be conforming.

Is an implementation that allows you turn off diagnostics required by the standard still conforming?

Consider the following:
typedef int;
int main () { return 0; }
If I compile this with clang with no warning specifications I get
warning: typedef requires a name [-Wmissing-declarations]
typedef int;
That's to be expected; typedef int is illegal per section 6.7 of the C11 standard, and per section 5.1.1.3,
A conforming implementation shall produce at least one diagnostic message if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint.
If I compile this using clang -Wno-missing-declarations, it compiles clean, without any diagnostic messages.
My question:
Does this mark clang as a non-conforming implementation, or is it okay to provide the ability to disable what would otherwise be mandatory diagnostics?

From the draft C11 standard section 4 Conformance we see that it is not strictly conforming:
A strictly conforming program shall use only those features of the
language and library specified in this International Standard.3) It
shall not produce output dependent on any unspecified, undefined, or
implementation-defined behavior, and shall not exceed any minimum
implementation limit.
but it is a conforming implementation since a conforming implementation is allowed to have extensions as long as they don't break a strictly conforming program:
[...]A conforming implementation may have extensions (including
additional library functions), provided they do not alter the behavior
of any strictly conforming program.4)
The C-FAQ says:
[...]There are very few realistic, useful, strictly conforming programs. On the other hand, a merely conforming program can make use of any compiler-specific extension it wants to.