The following code causes an error in the Green Hills C compiler (error: type int * is incompatible with argument type const int*), while it only produces a warning and compiles with gcc (warning: passing argument 1 of ‘func’ discards ‘const’ qualifier from pointer target type).
void func1(int* a)
{
    (*a)++;
}

const int g = 100;

void func2(void)
{
    func1(&g);   /* passes const int * where int * is expected */
}
Which behavior conforms to the C standard?
The call to func1(&g) is invalid. It is a constraint violation in the C language, i.e. it is what everyday terminology usually refers to as an "error". The C language does not support implicit conversion of a const int * value to int * type.
The fact that it is "just a warning" in GCC does not mean anything. Use the -pedantic-errors switch in GCC to make it report constraint violations as "errors". The Green Hills C compiler, as you observed yourself, reports it as an "error" without any help from your side.
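If the goal is to increment a value derived from g, a minimal sketch of a valid alternative (assuming a modifiable local copy is acceptable) looks like this:

#include <stdio.h>

void func1(int *a)
{
    (*a)++;
}

const int g = 100;

void func2(void)
{
    int copy = g;    /* copy the const object into a modifiable int */
    func1(&copy);    /* int * to int *: no constraint violation */
    printf("%d\n", copy);   /* prints 101; g itself is untouched */
}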
Which behavior conforms to the C standard?
Both compiler behaviors conform to the standard. As @AnT already explained, the call func1(&g) violates a language constraint. The standard's requirement on the compiler in such a case is expressed by paragraph 5.1.1.3/1:
A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint [...]
As you can see, @Olaf is right that the standard does not distinguish between different categories of diagnostics. That is an invention and convention of implementations, which generally distinguish between the two based on whether the situation is fatal to the compilation process. The standard has nothing further to say about what a compiler should do when a constraint violation is detected, neither in general nor in this particular case, so it does not even indirectly dictate whether a "warning" or an "error" should be emitted.
If the compiler does continue and eventually produces an executable, then the whole resulting program exhibits undefined behavior if at any point in its run it evaluates the problematic function call. This is a reason why a compiler might choose not to emit such a binary (i.e. might consider the constraint violation an "error"), but, again, the standard does not provide any guidance there.
Related
First of all, I know this way of programming is not good practice. For an explanation of why I'm doing this, read on after the actual question.
When declaring a function in C like this:
int f(n, r) {…}
The types of r and n will default to int. The compiler will likely generate a warning about it, but let's choose to ignore that.
Now suppose we call f but, accidentally or otherwise, leave out an argument:
f(25);
This will still compile just fine (tested with both gcc and clang). However there is no warning from gcc about the missing argument.
So my question is:
Why does this not produce a warning (in gcc) or error?
What exactly happens when it is executed? I assume I'm invoking undefined behaviour but I'd still appreciate an explanation.
Note that it does not work the same way when I declare int f(int n, int r) {…}; neither gcc nor clang will compile this.
Now if you're wondering why I would do such a thing: I was playing Code Golf and tried to shorten my code, which used a recursive function f(n, r). I needed a way to call f(n, 0) implicitly, so I defined F(n) { return f(n, 0); }, which was a few too many bytes for my taste. So I wondered whether I could just omit the second parameter. I can't; it still compiles but no longer works.
While optimizing this code, it was pointed out to me that I could just leave out a return at the end of my function – no warning from gcc about this either. Is gcc just too tolerant?
You don't get any diagnostics from the compiler because you are not using modern "prototyped" function declarations. If you had written
int f(int n, int r) {…}
then a subsequent f(25) would have triggered a diagnostic. With the compiler on the computer I'm typing this on, it's actually a hard error.
"Old-style" function declarations and definitions intentionally cause the compiler to relax many of its rules, because the old-style code that they exist for backward compatibility with would do things like this all the dang time. Not the thing you were trying to do, hoping that f(25) would somehow be interpreted as f(25, 0), but, for instance, f(25) where the body of f never looks at the r argument when its n argument is 25.
The pedants commenting on your question are pedantically correct when they say that literally anything could happen (within the physical capabilities of the computer, anyway; "demons will fly out of your nose" is the canonical joke, but it is, in fact, a joke). However, it is possible to describe two general classes of things that are what usually happens.
With older compilers, what usually happens is, code is generated for f(25) just as it would have been if f only took one argument. That means the memory or register location where f will look for its second argument is uninitialized, and contains some garbage value.
With newer compilers, on the other hand, the compiler is liable to observe that any control-flow path passing through f(25) has undefined behavior, and based on that observation, assume that all such control-flow paths are never taken, and delete them. Yes, even if it's the only control-flow path in the program. I have actually witnessed Clang spit out main: ret for a program all of whose control-flow paths had undefined behavior!
GCC not complaining about f(n, r) { /* no return statement */ } is another case like the old-compilers one above, where the old-style function definition relaxes a rule. void was invented in the 1989 C standard; prior to that, there was no way to say explicitly that a function does not return a value. So you don't get a diagnostic because the compiler has no way of knowing that you didn't mean to do that.
Independently of that, yes, GCC's default behavior is awfully permissive by modern standards. That's because GCC itself is older than the 1989 C standard and nobody has reexamined its default behavior in a long time. For new programs, you should always use -Wall, and I recommend also at least trying -Wextra, -Wpedantic, -Wstrict-prototypes, and -Wwrite-strings. In fact, I recommend going through the "Warning Options" section of the manual and experimenting with all of the additional warning options. (Note however that you should not use -std=c11, because that has a nasty tendency to break the system headers. Use -std=gnu11 instead.)
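For example, a typical invocation along those lines (the file name is hypothetical) might be:

gcc -std=gnu11 -Wall -Wextra -Wpedantic -Wstrict-prototypes -Wwrite-strings -o prog prog.c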
First off, the C standard doesn't distinguish between warnings and errors. It only talks about "diagnostics". In particular, a compiler can always produce an executable (even if the source code is completely broken) without violating the standard.¹
The types of r and n will default to int.
Not anymore. Implicit int has been gone from C since 1999. (And your test code requires C99 because for (int i = 0; ... isn't valid in C90).
In your test code gcc does issue a diagnostic for this:
.code.tio.c: In function ‘f’:
.code.tio.c:2:5: warning: type of ‘n’ defaults to ‘int’ [-Wimplicit-int]
It's not valid code, but gcc still produces an executable (unless you enable -Werror).
If you add the required types (int f(int n, int r)), it uncovers the next issue:
.code.tio.c: In function ‘main’:
.code.tio.c:5:3: error: too few arguments to function ‘f’
Here gcc somewhat arbitrarily decided not to produce an executable.
Relevant quotes from C99 (and probably C11 too; this text hasn't changed in the n1570 draft):
6.9.1 Function definitions
Constraints
[...]
If the declarator includes an identifier list, each declaration in the declaration list shall
have at least one declarator, those declarators shall declare only identifiers from the
identifier list, and every identifier in the identifier list shall be declared.
Your code violates a constraint (your function declarator includes an identifier list, but there is no declaration list), which requires a diagnostic (such as the warning from gcc).
Semantics
[...] If the
declarator includes an identifier list, the types of the parameters shall be declared in a
following declaration list.
Your code violates this shall rule, so it has undefined behavior. This applies even if the function is never called!
6.5.2.2 Function calls
Constraints
[...]
If the expression that denotes the called function has a type that includes a prototype, the
number of arguments shall agree with the number of parameters. [...]
Semantics
[...]
[...] If the number of arguments does not equal the number of parameters, the
behavior is undefined. [...]
The actual call also has undefined behavior if the number of arguments passed doesn't match the number of parameters the function has.
As for omitting return: This is actually valid as long as the caller doesn't look at the returned value.
Reference (6.9.1 Function definitions, Semantics):
If the } that terminates a function is reached, and the value of the function call is used by
the caller, the behavior is undefined.
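A hedged sketch of what that rule permits (function names hypothetical):

#include <stdio.h>

int describe(int x)
{
    if (x > 0)
        return x;
    /* Reaching the closing } here is fine by itself; it only becomes
       undefined if the caller uses the (nonexistent) return value. */
}

int main(void)
{
    printf("%d\n", describe(5));   /* fine: a value was returned */
    describe(-1);                  /* fine: the result is not used */
    /* printf("%d\n", describe(-1));  <-- this would be undefined */
    return 0;
}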
¹ The sole exception seems to be the #error directive, about which the standard says:
The implementation shall not successfully translate a preprocessing translation unit
containing a #error preprocessing directive unless it is part of a group skipped by
conditional inclusion.
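A short sketch of how #error is typically used (the condition here is just an illustration):

/* Translation must fail here unless the directive is skipped
   by conditional inclusion. */
#if !defined(__STDC__)
#error "This file requires a standard C compiler"
#endif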
I was answering a question and noticed something that seemed odd. The code in question was more complicated, but the observation boils down to the fact that this compiles in MSVC 14.0:
#include <stdio.h>

void foo(int);                     /* declaration: one int parameter */

int main()
{
    foo(66);
    getchar();
}

void foo(const char* str, int x)   /* definition: incompatible parameter list */
{
    printf("%s %d\n", str, x);
}
This code produces undefined behavior, because the value of str in foo is 66, which doesn't point to a valid null-terminated string, so in practice we (most likely) get a segfault.
As stated, I used Visual Studio 2015 (MSVC 14.0) to compile this. I'm compiling the code as C. If we try GCC 5.1, it fails.
At first I thought this was one of the weird things C allowed in its early days, left in so as not to break old code (such as implicit function declarations). But here we have a prototype AND a definition with the same name, yet they are not compatible. How come this is not rejected by the compiler? C doesn't allow function overloading, so two functions with the same name should not be legal.
Why does MSVC not reject this code? Is there an explanation for this behavior? Is this explicitly allowed in one of the standards?
EDIT: Because there seems to be much confusion in the comments, I would like to clarify. I know how to avoid these kinds of mistakes, I always compile with the maximum warning level and treat warnings as errors; this is not the point here. The question is purely theoretical: I want to know whether this behavior is legal and defined in the C Standard. Because two C compilers behave differently when given the same code, something is wrong.
According to C11 (N1570 draft) 6.7/4 Declarations (within Constraints section):
All declarations in the same scope that refer to the same object or
function shall specify compatible types.
The definition also serves as a declaration, hence the language constraint is violated.
For that, a conforming implementation is obligated to produce a diagnostic message, identified in an implementation-defined manner, as per 5.1.1.3 Diagnostics:
A conforming implementation shall produce at least one diagnostic
message (identified in an implementation-defined manner) if a
preprocessing translation unit or translation unit contains a
violation of any syntax rule or constraint (...)
That's all. A diagnostic message may be of any kind; they could even send you a letter, if they so choose.
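A sketch of the obvious fix, assuming the two-parameter version is the intended one, is simply to make the declaration match the definition:

#include <stdio.h>

void foo(const char *str, int x);   /* prototype now matches the definition */

int main(void)
{
    foo("hello", 66);   /* the compiler can now check the arguments */
    return 0;
}

void foo(const char *str, int x)
{
    printf("%s %d\n", str, x);
}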
#include <stdio.h>
int main()
{
    const int a = 1;
    int *p = (int *)&a;   // cast away const
    (*p)++;               // modifying a const object: undefined behavior
    printf("%d %d\n", *p, a);
    if (a == 1)
        printf("No\n");   // "No" in g++.
    else
        printf("Yes\n");  // "Yes" in gcc.
    return 0;
}
The above code gives No as output in g++ compilation and Yes in gcc compilation. Can anybody please explain the reason behind this?
Your code triggers undefined behaviour because you are modifying a const object (a). It doesn't have to produce any particular result, not even on the same platform, with the same compiler.
Although the exact mechanism for this behaviour isn't specified, you may be able to figure out what is happening in your particular case by examining the assembly produced by the code (you can see it by using the -S flag). Note that compilers are allowed to make aggressive optimizations by assuming the code has well-defined behaviour. For instance, a could simply be replaced by 1 wherever it is used.
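If the goal is simply to modify a value through a pointer, a minimal well-defined sketch drops the const:

#include <stdio.h>

int main(void)
{
    int a = 1;     /* not const: modifying it through p is well defined */
    int *p = &a;
    (*p)++;
    printf("%d %d\n", *p, a);   /* guaranteed to print "2 2" */
    return 0;
}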
From the C++ Standard (1.9 Program execution)
4 Certain other operations are described in this International
Standard as undefined (for example, the effect of attempting to
modify a const object). [ Note: This International Standard imposes
no requirements on the behavior of programs that contain undefined
behavior. —end note ]
Thus your program has undefined behaviour.
In your code, notice the following two lines:
const int a = 1;      // a is of type const int
int *p = (int *)&a;   // p is of type int *
You are putting the address of a const int variable into an int * and then trying to modify the value through it, which should have been treated as const. This is not allowed and invokes undefined behaviour.
For your reference, as mentioned in chapter 6.7.3, C11 standard, paragraph 6
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. If an attempt is
made to refer to an object defined with a volatile-qualified type through use of an lvalue
with non-volatile-qualified type, the behavior is undefined.
So, to cut a long story short, you cannot rely on the outputs for comparison. They are the result of undefined behaviour.
Okay, we have here 'identical' code passed to "the same" compiler, but once with a C flag and the other time with a C++ flag. As far as any reasonable user is concerned, nothing has changed. The code should be interpreted identically by the compiler because nothing significant has happened.
Actually, that's not true. While I would be hard pressed to point to it in a standard, the precise interpretation of 'const' differs slightly between C and C++. In C it's very much an add-on: the 'const' flag says that this normal variable 'a' should not be written to by the code around here. But there is a possibility that it will be written to elsewhere. With C++ the emphasis is much more on the immutable-constant concept, and the compiler knows that this constant is more akin to an 'enum' than a normal variable.
So I expect this slight difference means that slightly different parse
trees are generated which eventually leads to different assembler.
This sort of thing is actually fairly common, code that's in the C/C++
subset does not always compile to exactly the same assembler even with
'the same' compiler. It tends to be caused by other language features
meaning that there are some things you can't prove about the code right
now in one of the languages but it's okay in the other.
Usually C is the performance winner (as was re-discovered by the Linux kernel devs) because it's a simpler language, but in this example C++ would probably turn out faster (unless the C dev switches to a macro or enum and catches the unreasonable act of taking the address of an immutable constant).
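A short sketch of one observable difference (compile the same snippet as C90, C99, and C++ to see it):

void demo(void)
{
    const int n = 10;
    int arr[n];   /* C++: a plain array, since n is a constant expression;
                     C99: a variable-length array; C90: invalid */
    arr[0] = 0;
}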
Why does the program below not throw an error:
dfljshfksdhfl;
#include <stdio.h>
int main () {
    return 0;
}
gcc would just throw a warning:
test.c:1:1: warning: data definition has no type or storage class [enabled by default]
This is because, even though implicit int has not been part of the C standard since C99, some compilers still support it, mainly to avoid breaking a lot of old code. So this line:
dfljshfksdhfl;
ends up being equivalent to:
int dfljshfksdhfl;
clang gives us a much more informative warning by default:
warning: type specifier missing, defaults to 'int' [-Wimplicit-int]
dfljshfksdhfl;
^~~~~~~~~~~~~
We can use the -pedantic-errors flag to turn this into an error, although oddly enough this does not work for clang, so we have to resort to -Werror and turn all warnings into errors, which is actually a good habit to get into. As remyabel points out, for clang we can also use -Werror=implicit-int.
I've already answered a similar question (actually I'm pretty sure it's a duplicate, but whatever) and the answer is found in the C99 rationale.
A new feature of C99:
In C89, all type specifiers could be omitted from the declaration
specifiers in a declaration. In such a case int was implied. The
Committee decided that the inherent danger of this feature outweighed
its convenience, and so it was removed. The effect is to guarantee the
production of a diagnostic that will catch an additional category of
programming errors. After issuing the diagnostic, an implementation
may choose to assume an implicit int and continue to translate the
program in order to support existing source code that exploits this
feature.
@Shafik's answer tells you one way to turn the warning into an error (for Clang). If you consider -Werror to be too strict, you can turn just that one warning into an error with -Werror=implicit-int. In GCC, it appears that -pedantic-errors is necessary.
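For example (assuming the file is named test.c):

clang -Werror=implicit-int test.c
gcc -pedantic-errors test.c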
First of all, gcc is not a conforming C compiler by default. It implements a dialect of C89/C90 with GNU extensions.
You can use -std=cNN -pedantic (where NN can be 90, 99, or 11) to cause it to (attempt to) conform to a specified version of the ISO C standard. C90 permitted implicit int; it was dropped in C99.
But C compilers are not actually required to generate fatal error messages (except for a #error directive). The standard's requirement (N1570 5.1.1.3p1) is:
A conforming implementation shall produce at least one diagnostic
message (identified in an implementation-defined manner) if a
preprocessing translation unit or translation unit contains a
violation of any syntax rule or constraint, even if the behavior is
also explicitly specified as undefined or implementation-defined.
Diagnostic messages need not be produced in other circumstances.
A non-fatal warning qualifies as a "diagnostic message". A conforming C compiler can print a warning for any error -- even a syntax error -- and then continue to successfully compile the source file. (This is how some compiler-specific language extensions may be supported.)
Personally, I find gcc to be overly lax about certain errors; in my opinion, a missing int should be treated as a fatal error. But that's just my preference, not a requirement imposed by the standard.
The lesson here is that you should not assume that mere warnings are harmless. Ideally, compiling your code should produce no diagnostics at all. Cases where it's ok to ignore warnings are rare (but they do exist, since compilers are free to warn about perfectly valid code).
Before anyone marks this as a duplicate of related questions: I emphasize that I HAVE read all those questions, but I still have some questions of my own (yep, somewhat pedantic :) ).
For C
Some conclusions:
1. In C89 (C90), this is _undefined_.
2. In C99 (or C11), a return type of int is mandatory; control flow reaching the closing } will return a value of 0.
Here come my questions.
In C89, I have found nothing saying this is undefined, only unspecified?
Detail: the related parts in C89 are 5.1.2.2.1 Program startup and 5.1.2.2.3 Program termination (NOTE: both are under the 5.1.2.2 Hosted environment section, to which our later discussion is limited).
Cite: -- 5.1.2.2.3 Program termination --
A return from the initial call to the main function is
equivalent to calling the exit function with the value
returned by the main function as its argument. If the }
that terminates the main function is reached, the
termination status returned to the host environment is
unspecified.
Just note that part: If the } that terminates ...; it clearly says that if we omit the return type (thus the } will be reached), the termination status is unspecified.
According to the standard's definitions of undefined and unspecified, should I say that it gives an unspecified value, since whatever it returns is a legal int value, but the consequence is undefined, in that we could not predict what value will lead to what catastrophic consequence?
In C99, a return type of int is mandatory, but gcc --std=c99, given a test program without the int type (no return type at all, actually), gives only a warning: return type of ‘main’ is not ‘int’, but not an error?
Detail: the related parts are the same as those in C89.
Cite: -- 5.1.2.2.1 Program startup --
It shall be defined with a return type of int and ...
and -- 4. Conformance --
1. In this International Standard, ‘‘shall’’ is to be interpreted as a requirement on an implementation or on a program; conversely, ‘‘shall not’’ is to be interpreted as a
prohibition.
So shall should be interpreted as mandatory in this standard; why does gcc with the switch --std=c99 violate this?
C89/90 still has the implicit int rule, so main() is equivalent to int main(). By leaving off the return type, you've implicitly defined the return type as int. Some may consider this sloppy, but it's strictly conforming (i.e., involves no implementation defined, undefined or unspecified behavior).
For C99, the implicit int rule has been removed, so main() isn't defined. However, a compiler is only required to "issue a diagnostic" upon encountering a violation of a Shall or Shall not clause -- it's still free to continue compiling after issuing the diagnostic. Exactly what constitutes a diagnostic is implementation defined. As such, all that's needed for gcc to conform in this respect is documentation to say that the warning it issues is to be considered a diagnostic.
Edit: "implicit int" in the C89/90 standard isn't really a single rule in a single place -- it's spread around in a couple of places. The primary one is §6.5.2.1, where it says:
-- int, signed, signed int, or no type specifiers
This is part of a list, where all the items on each line of the list are considered equivalent, so this is saying that (unless otherwise prohibited) lack of a type specifier is equivalent to specifying (signed) int.
For function parameters, there's a separate specification (at §6.7.1) that: "Any parameter that is not declared has type int."
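A hedged sketch showing both implicit-int rules at once (C89/90 only; names hypothetical):

/* No type specifier: the return type defaults to int (6.5.2.1). */
f(n)
/* n appears in the identifier list but is never declared,
   so its type also defaults to int (6.7.1). */
{
    return n + 1;
}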
Even in the case of a constraint violation, the only thing a C compiler must issue is a "diagnostic". It is then allowed to continue and produce an executable program.
Just note that part: If the } that terminates ... , it clearly says that if we omit the return type
No, it doesn't. It says what happens when you omit the return exitStatus; from the end of main().
According to the standard's definitions of undefined and unspecified, should I say that it gives an unspecified value, since whatever it returns is a legal int value, but the consequence is undefined, in that we could not predict what value will lead to what catastrophic consequence?
No. It means that you don't know what the return status code of your program will be. The behavior is, however, not undefined: your program terminates. With what kind of result code, that's a different question.
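A sketch of the fully determinate alternative, returning an explicit status:

#include <stdlib.h>

int main(void)
{
    /* Returning an explicit value avoids the unspecified termination
       status that C90 describes for reaching the closing }. */
    return EXIT_SUCCESS;
}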
[...] gives only a warning: return type of ‘main’ is not ‘int’, but not an error?
That's how it's implemented. In old C (C89), and with some newer compilers as well, if you omit the return type of a function, it is assumed to be int (so even the warning looks a bit problematic).
So shall should be interpreted as mandatory in this standard; why does gcc with the switch --std=c99 violate this?
Probably yes. Note that GCC is a non-conforming implementation unless you use -ansi -pedantic, so in theory, any program compiled without these flags has undefined behavior. But that's theory...
It would seem that this is some sort of C90 standard reference mistake. I have no idea whether the document you link to is equivalent to the actual ISO C90 standard or not.
Apparently, this particular issue has changed from undefined, to unspecified, to well-defined over the years.
In early drafts of ANSI C "for the X3.J11 working group", you will find the following text:
2.1.2.2 Hosted environment
"Program termination"
A return from the initial call to the main function is equivalent
to calling the exit function with the value returned by the main
function as its argument. If the main function executes a return that
specifies no value, the termination status returned to the host
environment is undefined.