IOCCC 1987/westley.c - lvalue issues with GCC [duplicate] - c

This question already has answers here:
1998 vintage C code now fails to compile under gcc
(2 answers)
Closed 2 years ago.
This line-palindromic entry from the 1987 IOCCC:
https://www.ioccc.org/years.html#1987_westley
...compiles without any issues under TCC 0.9.27 with default options and works as intended.
However, GCC 9.3.0, even in -std=c89 mode, complains that the following instances of (int) (tni) are not lvalues:
for (; (int) (tni);)
(int) (tni) = reviled;
^
(lvalue required as left operand of assignment)
...
for ((int) (tni)++, ++reviled; reviled * *deliver; deliver++, ++(int) (tni))
^~
(lvalue required as increment operand)
(code beautified for better context)
My current thoughts:
In the = case, I suspect that the use of (int) (tni) as a condition in the for loop is what disqualifies it as an lvalue, but I am not sure.
In the ++ case, I can see later in the code that its palindromic nature forces the author to place a -- operator between (int) and (tni), which GCC does not flag. So GCC seems to require the ++ operator immediately before the variable rather than before the cast, but it reports this requirement as an lvalue complaint.
Is there a definitive answer to these GCC complaints? Is TCC too lax in letting these off the hook?
Thanks in advance!
EDIT: I was kindly pointed towards a similar question which answers the casting issue here - please see my comment below for the solution!

As is well known, TCC is not a conforming C implementation - it tries to be a small and fast compiler that compiles correct C code, and it often does not produce the diagnostics that the standard would require. Even more widely known is that the first C standard came into being in 1989, and most widely known of all is that the year 1987 preceded 1989.
C11 6.5.4p5:
Preceding an expression by a parenthesized type name converts the value of the expression to the named type. This construction is called a cast. 104) A cast that specifies no conversion has no effect on the type or value of an expression.
The footnote 104 notes that:
A cast does not yield an lvalue. Thus, a cast to a qualified type has the same effect as a cast to the unqualified version of the type.
For the assignment operator, 6.5.16p2 says:
An assignment operator shall have a modifiable lvalue as its left operand.
6.5.16p2 is in a Constraints section, so violations must be diagnosed.
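To see the constraint in isolation, here is a minimal sketch (my own, not the original westley.c code; the variable names are merely borrowed from it). The result of a cast is not an lvalue, so the commented-out lines are constraint violations that GCC must diagnose, while operating on the object directly is fine:

int main(void)
{
    int tni = 0, reviled = 1;

    /* (int) tni = reviled;    error: lvalue required as left operand of assignment */
    /* ++(int) tni;            error: lvalue required as increment operand          */

    tni = reviled;             /* OK: tni itself is a modifiable lvalue */
    ++tni;                     /* OK */
    return tni;
}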

Related

Why are const qualified variables accepted as initializers on gcc?

When compiling this code with the latest version of gcc (or clang) using -std=c17 -pedantic-errors -Wall -Wextra
static const int y = 1;
static int x = y;
then I get no compiler diagnostic message even though I'm fairly sure that this is not valid C but a constraint violation. We can prove that it is non-conforming by taking a look at C17 6.7.9/4:
Constraints
...
All the expressions in an initializer for an object that has static or thread storage duration shall be constant expressions or string literals.
Then the definition about constant expressions, in this case an integer constant expression (6.6):
An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts.
And then finally the definition about integer constants (6.4.4.1/2):
An integer constant begins with a digit, but has no period or exponent part. It may have a prefix that specifies its base and a suffix that specifies its type.
Thus a const int variable is not an integer constant nor is it an integer constant expression. And therefore not a valid initializer. This has been discussed before (for example here) and I think it's already established that this is non-conforming. However, my question is:
Why did gcc choose to be non-compliant even in strict mode?
clang has apparently always been non-compliant, but gcc changed from being compliant in version 7.3 to non-compliant in version 8.0 and above. gcc 7.3 and earlier gives "error: initializer element is not constant" even in default mode without -pedantic-errors.
Some sort of active, conscious decision seems to have been made regarding this message. Why was it removed entirely in gcc and why didn't they leave it as it was when compiling in strict mode -std=c17 -pedantic-errors?
Why did gcc choose to be non-compliant even in strict mode?
Inasmuch as the question as posed is directed to the motivation of the developers, the only information we have to go on as third parties comes from the public development artifacts, such as GCC bugzilla, repository commit messages, and actual code. As was pointed out in comments, the matter is discussed in the Bugzilla comment thread associated with the change.
The Bugzilla discussion appears to show that the developers considered the standard's requirements in this area, albeit in a somewhat perfunctory manner. See in particular comments 9 and 10. They raise paragraph 6.6/10 of the language specification:
An implementation may accept other forms of constant expressions.
They do not subject this to any particular scrutiny, and I read the comments more as seeking a justification for the change than as a thoughtful inquiry into GCC conformance considerations.
Thus, they made the change because they wanted to implement the feature request, and they found sufficient (for them) justification in the language of the standard to consider the altered behavior to be consistent with language constraints, therefore not requiring a diagnostic.
There is also an implied question of whether recent GCC's silent acceptance of the declaration forms presented in fact violates conforming processors' obligation to diagnose constraint violations.
Although it is possible to interpret 6.6/10 as allowing implementations to accept any expressions they choose as conforming to the requirements for any kind of constant expression, that seems fraught. Whether a given piece of code satisfies the language's constraints should not be implementation dependent. Either of these points of interpretation, if accepted, would resolve that problem:
6.6/10 should be interpreted as expressing a specific case of the general rule that a conforming implementation may accept non-conforming code, without implying that doing so entitles the processor to treat the code as conforming.
6.6/10 should be interpreted as permitting processors to interpret more expressions as "constant expressions" than those described in the preceding paragraphs, but that has no bearing on the definitions of the specific kinds of constant expressions defined in those paragraphs ("integer constant expressions" and "arithmetic constant expressions").
Those are not mutually exclusive. I subscribe to the latter, as I have written previously, and I tend to favor the former as well.
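To tie the constraint discussion above back to code, here is a small sketch (identifiers are made up, not from the question) of strictly conforming ways to initialize a static object without relying on the extension:

#define Y_MACRO 1
enum { Y_ENUM = 1 };

static const int y = 1;
/* static int x = y;         accepted by newer gcc/clang, but y is not an
                              integer constant expression per C17 6.6     */
static int x1 = Y_MACRO;     /* OK: the macro expands to an integer constant */
static int x2 = Y_ENUM;      /* OK: enumeration constants are permitted      */

int main(void) { return x1 + x2; }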

Detecting if expression is lvalue or rvalue in C

Is there any way of determining whether a given expression is an lvalue or an rvalue in C? For instance, does there exist a function or macro is_lvalue with the following sort of behaviour:
int four() {
    return 4;
}

int a = 4;

/* Returns true */
is_lvalue(a);
/* Returns false */
is_lvalue(four());
I know equivalent functionality exists in C++, but is there any way of doing this in any standard of C? I'm not particularly interested in GCC-specific extensions.
Thank you for your time.
The C standard does not provide any method for detecting whether an expression is an lvalue or not, either by causing some operation to have different values depending on whether an operand is an lvalue or not or by generating a translation-time diagnostic message or error depending on whether an operand is an lvalue or not.
C implementations may of course define an extension that provides this feature.
About the closest one can get in strictly conforming C is to attempt to take the address of the expression with the address-of operator &. This will produce a diagnostic message (and, in typical C implementations, an error) if its operand is not an lvalue. However, it will also produce a message for lvalues that are bit-fields or that were declared with register. If these are excluded from the cases of interest, then it may serve to distinguish between lvalues and non-lvalues during program translation.
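A rough sketch of that approach (the macro name is made up): sizeof keeps its operand unevaluated, so the & only serves to trigger a translation-time diagnostic when the argument is not an lvalue (or is a bit-field or register-declared object):

#define REQUIRE_LVALUE(x) ((void) sizeof (&(x)))

int four(void) { return 4; }

int main(void)
{
    int a = 4;
    REQUIRE_LVALUE(a);            /* compiles: a is an lvalue */
    /* REQUIRE_LVALUE(four());       would not compile: four() is not an lvalue */
    return 0;
}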

MISRA C rule 12.2

I am new to MISRA rules concepts. I have a rule 12.2 warning saying
The value of an expression shall be the same under any order of
evaluation that the standard permits (MISRA C 2004)
on the following C code:
PtToStack->Entry[PtToStack->top] = e ;
where PtToStack is pointer to stack, Entry is array in the stack structure and top variable is a field of the stack structure.
e has the same type of Entry.
Could any one help me to understand the warning?
This rule from MISRA-C:2004 (the older standard) concerns the order of evaluation of operands in expressions where the order is not specified. There are plenty of examples and educational material regarding the issue listed below rule 12.2 in the MISRA document.
In your expression, there are no issues with unspecified order of evaluation. Therefore, the warning is incorrectly generated by your tool. Your static analyser is bad; file a bug report with the tool vendor.
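For contrast, a small sketch (hypothetical names) of the kind of expression rule 12.2 does target, next to the question's harmless statement:

static int x = 0;

static int inc(void)   { return ++x; }
static int twice(void) { return x * 2; }

int main(void)
{
    /* Violates 12.2: whether inc() or twice() is called first is
       unspecified, so r may legitimately differ between compilers. */
    int r = inc() + twice();

    /* The question's statement has no such dependency: nothing in
       PtToStack->Entry[PtToStack->top] = e; is both read and modified. */
    return r;
}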

technical legality of incompatible pointer assignments

The C11 standard ISO/IEC 9899:2011 (E) states the following constraints for simple assignments in §6.5.16.1/1:
One of the following shall hold:
- the left operand has atomic, qualified, or unqualified arithmetic type, and the right has arithmetic type;
- the left operand has an atomic, qualified, or unqualified version of a structure or union type compatible with the type of the right;
- the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) both operands are pointers to qualified or unqualified versions of compatible types, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
- the left operand has atomic, qualified, or unqualified pointer type, and (considering the type the left operand would have after lvalue conversion) one operand is a pointer to an object type, and the other is a pointer to a qualified or unqualified version of void, and the type pointed to by the left has all the qualifiers of the type pointed to by the right;
- the left operand is an atomic, qualified, or unqualified pointer, and the right is a null pointer constant; or
- the left operand has type atomic, qualified, or unqualified _Bool, and the right is a pointer.
I am interested in the case in which both sides are pointers to incompatible types different from void. If I understand correctly, this should at the very least invoke UB, as it violates this constraint. One example for incompatible types should be (according to §6.2.7 and §6.7.2) int and double.
Therefore the following program should be in violation:
int main(void) {
    int a = 17;
    double* p;
    p = &a;
    (void)p;
}
Both gcc and clang warn about "-Wincompatible-pointer-types", but do not abort compilation (compilation with -std=c11 -Wall -Wextra -pedantic).
Similarly, the following program only leads to a "-Wint-conversion" warning, while compiling just fine.
int main(void) {
    int a;
    double* p;
    p = a;
    (void)p;
}
Coming from C++, I expected that either of those test cases would require a cast to compile. Is there any reason why either of the programs would be standards-legal? Or, are there at least significant historic reasons for supporting this code style even when disabling the entertaining GNU C extensions by explicitly using -std=c11 instead of -std=gnu11?
Is there any reason why either of the programs would be standards-legal?
These programs are not "standards-legal". They contain constraint violations and you already quoted the right text from the standard.
The compilers conform to the standard by producing a diagnostic for constraint violation. The standard does not require compilation to abort in the case of a constraint violation or other erroneous program.
It doesn't say in as many words, but the only reasonable conclusion is that any executable generated as a result of a program containing a constraint violation has completely undefined behaviour. (I have seen people try to argue otherwise though).
Speculation follows: C (and C++) are used for many purposes; sometimes people want "high level assembler" for their machine and don't care about portability or standards. Presumably the compiler vendors set the defaults to what they think their target audience would prefer.
The compiler flag (both gcc and clang) to request checks for strict standards conformance and to refuse to compile nonconformant code is -pedantic-errors:
$ gcc -std=c11 -pedantic-errors x.c
x.c: In function ‘main’:
x.c:3:15: error: initialization from incompatible pointer type [-Wincompatible-pointer-types]
double* p = &a;
^
Clang:
$ clang -std=c11 -pedantic-errors x.c
x.c:3:11: error: incompatible pointer types initializing 'double *' with an
expression of type 'int *' [-Werror,-Wincompatible-pointer-types]
double* p = &a;
^ ~~
1 error generated.
A significant proportion (to say the least) of typical C code in the wild is nonconformant, so -pedantic-errors would cause most C programs and libraries to fail to compile.
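For illustration, a minimal sketch (my own, not part of the answer above) of the cast-based form a C++ programmer would expect; the explicit conversion removes the constraint violation, so even -pedantic-errors is silent, although actually using p would still run into other problems:

int main(void) {
    int a = 17;
    double* p;
    p = (double*)&a;   /* explicit conversion: no diagnostic required; note that
                          the conversion itself may already be undefined if int
                          and double have different alignment requirements      */
    (void)p;           /* dereferencing p here would be undefined in any case   */
}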
Your code example and your citation of the standard do not match. The example is an initialization, while 6.5.16 talks about assignment.
Confusingly, the matching-type requirement is in a Constraints section (6.5.16) for assignment, but "only" in the Semantics section (6.7.9) for initialization. So the compilers have the "right" not to issue a diagnostic for initialization.
In C, constraint violations only require "diagnostics", the compiler may well continue compilation, but there is no guarantee that the resulting executable is valid.
On my platform, Debian testing, both compilers give me a diagnostic without any options, so I guess your installation must be quite old and obsolete.
No, Yes
It's really very simple.
No, not standards legal. Yes, significant historical reasons.
C did not originally even have casts. As a system programming language, using it as an ultra-powerful glorified assembler was not only reasonable, it was best-practice and really "only-practice", back in the day.
A key piece of information should shine a light on things: it really is not the compiler's job to either implement or enforce the specification. The specification is, actually, literally, only a suggestion. The compiler's actual job is to compile all the C code that was ever written, including pre-C11, pre-C99, pre-C89, and even pre-K&R. This is why there are so many optional restrictions. Projects with modern code styles turn on strictly conforming modes.
There is the way C is defined in the standards, and there is the way C is used in practice. So, as you can see, the compiler simply can't refuse to build the code.
Over decades, developers have been shifting to portable, strictly conforming code, but when C first appeared, it was used a bit like a really amazingly powerful assembler, and it was kind of open season on address arithmetic and type punning. Programs in those days were mostly written for one architecture at a time.

How is it that "sizeof (char[0])" compiles just fine with GCC

A colleague of mine inserted a (void) sizeof (char[0]) at the end of a multi-line macro, apparently as an alternative to do {...} while (0). I have looked around but I can't find any reference to it, and it surprises me that it even compiles.
Is it valid C? I would love a reference to the std.
If you compile with gcc -pedantic, you'll get a warning message:
warning: ISO C forbids zero-size array [-Wpedantic]
The latest draft of the C standard, N1570, section 6.7.6.2 (Array declarators) paragraph 1 says:
If the expression is a constant expression, it shall have a value
greater than zero.
This is part of a constraint, so char[0] is a constraint violation, requiring a diagnostic from any conforming C compiler.
(It's not 100% clear that that particular clause applies to the type name char[0] in the absence of a declared object, but the general idea is that standard C does not support zero-length arrays.)
gcc supports zero-sized arrays as an extension, documented here.
C11 6.5.3.4/2:
The sizeof operator yields the size (in bytes) of its operand, which may be an
expression or the parenthesized name of a type.
/4:
When applied to an operand that has array type, the result is the total number of bytes in the array.
C11 6.7.7 defines type name, especially /2:
In several contexts, it is necessary to specify a type. This is accomplished using a type name, which is syntactically a declaration for a function or an object of that type that omits the identifier.
So char[0] is a type name because it is syntactically a declaration for an object that omits the identifier. (Semantically it's invalid because zero-sized arrays are not allowed, but it is still a type name).
Based on these quotes I would say that sizeof is underspecified. 6.5.3.4/2 doesn't restrict type name to type names that would be a legal declaration if an identifier were included. 6.5.3.4/4 does say "the array" but nobody can agree on what "array" means in C anyway and I don't think this clearly implies anything about char[0].
So, based on these quotes, I would call it inconclusive.
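For context, a rough sketch (the macro and struct names are made up) of the pattern the question describes: ending a multi-statement macro with (void) sizeof so the whole expansion is a single expression, rather than wrapping it in do { ... } while (0). Using char[1] instead of char[0] keeps it strictly conforming:

struct state { int count; int error; };

#define RESET_STATE(s) \
    ((s)->count = 0, (s)->error = 0, (void) sizeof (char[1]))

int main(void)
{
    struct state st = {3, 1};
    if (st.count)
        RESET_STATE(&st);   /* expands to a single expression statement */
    return st.count + st.error;
}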